
A suggestion on improving human migration models through incorporation of regional identities



Willem R. J. Vermeulen BSc

August 25, 2018


Abstract

Human migration is a complex phenomenon that is influenced by many different factors. The migration process is commonly modelled using gravity models or radiation models. In this paper we propose that human migration models can be improved by embedding regional identities into the model. This is tested by adding three different sets of Dutch identity regions to a gravity model. Through analysis of Dutch internal migration data between 1996 and 2016, we find that individuals are more likely to move towards municipalities located within the same identity region. We also find that this influence of identity becomes larger when the identity regions are as small and as well defined as possible.


Contents

Acknowledgements
1 Introduction
1.1 Research question and hypotheses
1.1.1 Introducing regional identities to a human migration model increases its predictive value
1.1.2 Distance is handled differently when regional identities are introduced
1.1.3 Smaller regional identities have a higher predictive power than larger identities
2 Methodologies
2.1 Model specification
2.2 Used migration data
2.3 Fitting a standard gravity model for human migration
2.4 Expansion of the gravity model
2.4.1 Specification of regional identities
2.4.2 Introduction of the different sets of identity regions
2.5 Comparison of the importance of identity in different sets of regions
2.6 Creation of other sets of identity regions
2.6.1 Randomly generated regions
2.6.2 Randomly generated spatially clustered regions
2.7 Optimisation of a set of regions
3 The significance of the influence of the specified identity regions
3.1 Differences in the mean ICM values
3.2 Differences in the median ICM values
A Influences on migration decisions
A.1 Considerations in migration decisions
A.2 Economic benefits
A.3 Availability of amenities
A.4 Travel distance
A.5 Information distance
A.6 Social distance
A.7 Household optimisation
A.8 Family-cycle considerations
A.9 Policies & disasters
B Specification of identity regions
C Tactics to increase the average ICM value
C.1 Using network metrics
C.2 Similarity of migration behaviour
C.3 Reassigning municipalities
C.4 Simulated annealing
C.5 Discussion


List of Figures

2.1 Visual representation of the steps taken to assign municipalities to spatially clustered regions using the k-means algorithm. This algorithm can be applied to generate different numbers of regions. This number of regions is controlled by the variable k. To be able to compare the generated regions with a certain set of identity regions, this k is set equal to the number of regions present in this set of identity regions.

2.2 A visual representation of the optimisation algorithm. Given a certain starting configuration, it is determined what regions are located within a distance of 20 kilometres from each existing municipality. For each of these municipalities, the change in the global average ICM value is measured when a municipality would be part of that region. Every municipality that should be part of another region than it already was, is then relocated with a chance of 50%. This process is repeated until no more municipalities are relocated for three iterations.

3.1 The mean ICM values for 250 sets of twelve randomly generated regions, 250 sets of twelve randomly generated spatially clustered regions, fifty sets of twelve optimised randomly generated spatially clustered regions and fifty sets of optimised NUTS 2 regions, compared to the ICM value of the original NUTS 2 regions.

3.2 Whatever changes in parameters we make to the human migration model, the mean value of the ICM values of the randomly generated spatially clustered regions is always significantly lower than the ICM value of the NUTS 2 regions. For each of the parameter configurations, thirty different randomly generated spatially clustered regions were generated.

3.3 The mean ICM values for 250 sets of forty randomly generated regions, 250 sets of forty randomly generated spatially clustered regions, fifty sets of forty optimised randomly generated spatially clustered regions and fifty sets of optimised NUTS 3 regions, compared to the ICM value of the original NUTS 3 regions.

3.4 Whatever changes in parameters we make to the human migration model, the


3.7 The median ICM values for 250 sets of twelve randomly generated regions, 250 sets of twelve randomly generated spatially clustered regions, fifty sets of twelve optimised randomly generated spatially clustered regions and fifty sets of optimised NUTS 2 regions, compared to the ICM value of the original NUTS 2 regions.

3.8 The median ICM values for 250 sets of forty randomly generated regions, 250 sets of forty randomly generated spatially clustered regions, fifty sets of forty optimised randomly generated spatially clustered regions and fifty sets of optimised NUTS 3 regions, compared to the ICM value of the original NUTS 3 regions.

3.9 The median ICM values for 250 sets of seventy randomly generated regions, 250 sets of seventy randomly generated spatially clustered regions, fifty sets of seventy optimised randomly generated spatially clustered regions and fifty sets of optimised literature regions, compared to the ICM value of the original regions specified through literature.

3.10 Differences between the mean ICM values of the optimised randomly spatially clustered regions and the mean ICM values of the optimised predefined regions for each predefined set of regions, using three different distance cut-offs in the optimisation algorithm. For each of the parameter configurations, thirty different optimised regions were generated.

3.11 Differences between the median ICM values of the optimised randomly spatially clustered regions and the median ICM values of the optimised predefined regions for each predefined set of regions, using three different distance cut-offs in the optimisation algorithm. For each of the parameter configurations, thirty different optimised regions were generated.

4.1 Three different approaches to defining identity regions: with hard boundaries, fuzzy boundaries, or by looking at the existing connections between two municipalities.

D.1 The ICM values calculated for each municipality when the Netherlands are split into the NUTS 2 regions. All ICM values are positive. The ICM values in the southern part of Limburg, Zeeland, and the northern parts of Friesland and Groningen are all larger than the ICM values in other parts of the country. The average ICM value for municipalities located within these twelve regions is 20.91, the median ICM value is 12.35. Municipal boundary data used in this map is acquired from the Basisregistratie Kadaster (2016).

D.2 The ICM values calculated for each municipality when the Netherlands are split into the NUTS 3 regions. There are municipalities scattered throughout the country with relatively high ICM values. The average ICM value for municipalities located within these forty regions is 59.57, the median ICM value is 33.03. The lower ICM values are clustered around the centre of the country. Municipal boundary data used in this map is acquired from the Basisregistratie Kadaster (2016).

D.3 The ICM values calculated for each municipality when the Netherlands are split into the seventy regions specified by literature. The average ICM value for municipalities located within these regions is 73.91, the median ICM value is 42.34. Most ICM values are positive, but clusters of lower ICM values are found in North-Holland and Utrecht. The ICM values of Texel, Vlieland and the Wijdemeren municipalities are even negative. Municipal boundary data used in this map is acquired from the Basisregistratie Kadaster (2016).

D.4 The ICM values for each municipality in four different scenarios in which the Netherlands are split into twelve random regions, disregarding any distance. In each of these scenarios about half of the municipalities have a negative ICM value, and half of the municipalities have a positive ICM value.


D.5 The ICM values for each municipality in four different scenarios in which the Netherlands are split into twelve randomly spatially clustered regions. When the ICM values in these scenarios are compared to the ICM values created by the original regions, it becomes apparent that the ICM values of some municipalities become negative in the randomly generated spatially clustered regions, whereas all ICM values in the original regions were positive.

D.6 The ICM values for each municipality in four different scenarios in which the Netherlands are split into twelve randomly spatially clustered regions, and then further optimised. As this optimisation technique is based on chance, different optima are found. When the generated ICM values are compared to the ICM values of the original regions, we see that the ICM values are not distributed in a similar way. Whereas the variance in the ICM values in the original regions is very low, various high ICM values and negative ICM values occur in the randomly generated spatially clustered regions.

D.7 The ICM values for each municipality in four different scenarios in which the NUTS 2 regions are further optimised. Since this optimisation technique is partially based on chance, different optima are found. When the generated ICM values are compared to the ICM values of the original regions, we find that the ICM values are not distributed in a similar way. Whereas the variance in the ICM values in the original regions is very low, various high ICM values and some negative ICM values occur in the randomly generated spatially clustered regions. When compared to the ICM values of the non-optimised randomly spatially clustered regions, we do however find that the number of negative ICM values is decreased.

D.8 The ICM values for each municipality in four different scenarios in which the Netherlands are split into forty random regions, disregarding any distance. In each of these scenarios most municipalities have a negative ICM value. The other 15% of the regions have slightly positive ICM values. In each of the four situations, we find that there are extremely positive and extremely negative ICM values.

D.9 The ICM values for each municipality in four different scenarios in which the Netherlands are split into forty randomly spatially clustered regions. As opposed to the municipalities in the original NUTS 3 regions, municipalities within these randomly spatially clustered regions have more negative ICM values.

D.10 The ICM values for each municipality in four different scenarios in which the Netherlands are split into forty randomly spatially clustered regions, and then further optimised. As this optimisation technique is based on chance, different optima are found. Most municipalities have positive ICM values, except for two cases in which municipalities on the Frisian islands had negative ICM values, and one case in which a municipality in Friesland had a negative ICM value.

D.11 The ICM values for each municipality in four different scenarios in which the NUTS


D.13 The ICM values for each municipality in four different scenarios in which the Netherlands are split into seventy randomly spatially clustered regions. When the ICM values in these scenarios are compared to the ICM values of the original regions, it becomes apparent that the ICM values in a lot of municipalities are actually higher than they were before. On the other hand, more municipalities have negative ICM values. This pattern could be explained by the fact that municipalities in certain parts of the Netherlands are larger than in others. When the regional centres are spread in an equal way over the country by using the k-means algorithm, this means that municipalities that should belong to the same larger identity region are less likely to be assigned to the same region.

D.14 The ICM values for each municipality in four different scenarios in which the Netherlands are split into seventy randomly spatially clustered regions, and then further optimised. As this optimisation technique is based on chance, different optima are found. Even though most ICM values are positive, each of the four different scenarios contains at least one municipality with a negative ICM value. On the Frisian islands, and in one municipality in Zeeland, these negative ICM values appear more than once. When the ICM values of the optimised regions are compared to the ICM values of the original randomly spatially clustered regions, we find that the number of municipalities with negative ICM values has decreased.

D.15 The ICM values for each municipality in four different scenarios in which the seventy regions specified by literature are further optimised. As this optimisation technique is based on chance, different optima are found. In three of the four scenarios all municipalities have positive ICM values, in one scenario three municipalities have negative ICM values. When the ICM values of the optimised regions are compared to the ICM values of the municipalities located in the original regions, it becomes clear that ICM values of municipalities located all over the


List of Tables

2.1 Sources for the values of the different variables used in the regression.
2.2 Different parameters used in the extended gravity model
2.3 Different sets of identity regions embedded in the human migration model.
B.1 Dutch municipalities that existed in 2016, split into 70 different identity regions.


A special thanks to Rick Quax, Debraj Roy and Wessel Klijnsma for your advice and insights, and to my family and girlfriend for


CHAPTER 1

Introduction

Human migration decisions are based on a broad spectrum of factors. This makes it difficult to accurately predict human migration behaviour. It has been discovered that the choice to move to a certain destination is likely influenced by its economic prospects (Smith, 1776; Peters, 1984; J. Kok, 2004), the availability of amenities (Tiebout, 1956; P. Graves and Linneman, 1979), and the travel distance (Ravenstein, 1885; Ravenstein, 1889; Grigg, 1977; Peters, 1984), information distance (P. Bouman and W. Bouman, 1967; J. Kok, 2004) and social distance (Ravenstein, 1876; Ravenstein, 1885; Weber, 1899; Hipp et al., 2012) between both locations (for more details, see Appendix A). The way in which these variables influence the migration decision differs for each individual. After all, each individual has different personal connections (Bauer and Zimmermann, 1997; Massey, 2015), requirements (Harts and Hingtsman, 1986; J. Kok, 2004) and aspirations (Greenwood, 1985; Lucassen, 2000).

It is important for governments and companies to be able to predict human migration flows, because this helps them plan ahead. These predictions can be made using different types of models. A popular type of migration model is the gravity model, first introduced by Zipf (1946) and applied by many others since (Greenwood, 1985; Cummins, 2009; Anderson, 2011). In these models interactions between different locations are specified as a direct function of their mutual geographical distance and their population mass, which serves as a proxy for the economic prospects of a location. Another popular, newer type of migration model is the radiation model, introduced by Simini et al. (2012) and applied and extended since (Yang et al., 2014; Ren et al., 2014; Kang et al., 2015). In these models residents create intervening opportunities for migrants, which means that geographical distance only has an indirect influence on the generated migration flows. Both types of models can come in different forms. A good example of this variety in models is an artificial neural network model that included both the traditional variables, as well as intervening variables and amenity variables (Robinson and


H3 Smaller, more specific regional identities can explain the anomalies in local migration behaviour better than larger, more generic regional identities.

1.1.1 Introducing regional identities to a human migration model increases its predictive value

As described before, human migration is based on many factors. A basic gravity model for human migration disregards many of these factors and only takes distance and population size into account. The availability of amenities and the economic prospects of a region could correlate with the number of people living there, and the geographical distance could correlate with the travel distance and information distance, but this still leaves social distance out of the model.

The similarities and differences in identity are an important part of the experienced social distance. Even though identity is a complex concept, we thus hypothesise that introducing regional identities into the human migration model can at least partially account for a missing factor in the model.

1.1.2 Distance is handled differently when regional identities are introduced

When regional identity is introduced to the human migration model, we hypothesise that this will have an effect on the way distance is handled. Most migration movements that also involve regional identities will take place over shorter distances. If this means that migration numbers over shorter distances are increased towards municipalities that share the same regional identity, the distance term would initially have been fitted to contaminated data: short-distance migration flows would have been larger, but not only because of the small distance between both locations.

1.1.3 Smaller regional identities have a higher predictive power than larger identities

When the specified regional identities are smaller, we hypothesise that these regions can explain anomalies in local migration behaviour better than larger, more generic regional identities. Larger regional identities might actually consist of many smaller identities that are combined together. We would thus hypothesise that larger regional identities would still have some predictive power, but that the regions are actually inaccurate.


CHAPTER 2

Methodologies

In this chapter we first specify a basic gravity model for human migration, which is then fitted to Dutch data collected between 1996 and 2016. We then specify three different sets of identity regions and introduce them to the gravity model. After creating this model, we define a way of comparing the influence of identity on migration for different identity sets, and define ways to create and optimise the definitions of such regional identities.

2.1 Model specification

As mentioned in the introduction, there are two types of models that are often used to model migration. Even though radiation models seem to work slightly better on a larger scale than gravity models, a gravity model is used in this research. The decisive factor is that a gravity model explicitly uses the distance variable, whereas the radiation model uses the distance variable only indirectly (Piovani et al., 2018). The influence of distance on the migration process, and the impact of the introduction of identity regions on that distance variable, would otherwise be hard to determine.

The most basic gravity model often used to model human migration is shown in Equation 2.1. Within this equation the populations $p_a$ and $p_b$ of municipalities a and b are positively related to the number of people that migrate from municipality a to b, $M_{a \to b}$. When more people live in municipality a, a larger number of people can leave that location, and when there are more people living in municipality b, it is likely that there are more opportunities in that municipality (Weber, 1899; J. Kok, 2004). This could make people more willing to move there.
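For orientation, a gravity model of this kind is commonly written in the form below. This is a sketch inferred from the fitted coefficients reported in Equation 2.3 and the parameters listed in Table 2.2, not a quotation of Equation 2.1; in particular, placing the distance term in the denominator with a positive exponent $\gamma$ is an assumption.

$$
M_{a \to b} = G \cdot \frac{p_a^{\alpha} \, p_b^{\beta}}{\Delta_{a \to b}^{\gamma}} + \epsilon_{a \to b}
$$

Ignoring the error term and taking logarithms gives a form that is linear in the parameters, with $\delta = \log(G)$:

$$
\log M_{a \to b} \approx \delta + \alpha \log p_a + \beta \log p_b - \gamma \log \Delta_{a \to b}
$$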


2.2 Used migration data

Some of the internal migration events within the Netherlands take place within municipalities, whereas the other events take place between two different municipalities. This means that both intramunicipal and intermunicipal migration data should be used when we want to include all migration events. Such migration data is available for the years 1995 to 2016 via Statistics Netherlands (see footnote 1).

The migration data is collected directly from the Dutch civil registration database. This means that the data is as accurate as possible, because only migration movements that were not correctly registered are missing. Even though no data is available on the percentage of unregistered movements, this shortcoming is not expected to have a large effect on the outcomes of this research: research by Bouhuijs and Meijer (2017) showed that in 2016 96.26% (95% CI [95.72%, 96.81%]) of Dutch citizens were registered at the right address.

For each year the number of intermunicipal migrants between every combination of municipalities is recorded, as well as the number of intramunicipal migrants. This does not mean that data for the same number of municipalities is recorded each year: because of municipal mergers, the number of municipalities has decreased from 625 in 1996 to 390 in 2016 (Centraal Bureau voor de Statistiek, 2018b; Centraal Bureau voor de Statistiek, 2018c). To be able to create maps and compare migration data in different years, we artificially merge municipalities to form the municipalities that existed in 2016.
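The merging step can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the `successor` mapping (historical municipality to the 2016 municipality it ended up in), the function name and the toy data are all hypothetical. One plausible consequence, assumed here, is that flows between two municipalities that later merged are counted as intramunicipal moves of the merged municipality.

```python
from collections import defaultdict


def merge_flows(flows, successor):
    """Aggregate origin-destination counts onto the 2016 municipalities.

    flows:     dict mapping (origin, destination) -> number of migrants
    successor: dict mapping a historical municipality to the 2016
               municipality that absorbed it (hypothetical input);
               municipalities missing from the mapping are kept as-is
    """
    merged = defaultdict(int)
    for (origin, destination), count in flows.items():
        a = successor.get(origin, origin)
        b = successor.get(destination, destination)
        # Flows whose endpoints map to the same 2016 municipality
        # end up on the diagonal, i.e. as intramunicipal moves.
        merged[(a, b)] += count
    return dict(merged)


# Toy example: municipalities "A" and "B" merge into "AB".
flows = {
    ("A", "B"): 10,
    ("B", "A"): 4,
    ("A", "A"): 20,
    ("A", "C"): 7,
    ("B", "C"): 3,
}
successor = {"A": "AB", "B": "AB"}
merged = merge_flows(flows, successor)
```

In the toy example the three flows inside the merged municipality are summed into one intramunicipal count, and the two flows towards "C" into one intermunicipal count.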

2.3 Fitting a standard gravity model for human migration

To be able to fit the gravity model to the migration data, we also need to have data on the population size of all municipalities and data on the distances between all locations. While municipal population data can easily be acquired through Statistics Netherlands (Centraal Bureau voor de Statistiek, 2018a), it is harder to acquire data on the distance travelled by each individual. The distance travelled by every migrant between the same two municipalities will always slightly differ. Migrants do not live at the same location in both municipalities. Because it is impossible to know the exact migration distance for each migration event, the distance travelled is approximated by the length of the straight line between the geographical centres of both municipalities in kilometres. This can be done because this distance is highly correlated with the travel time between two locations (Phibbs and Luft, 1995). Even though there are cases where this assumption does not hold, such as when certain geographical boundaries cannot be crossed or the population centre is located far from the geographical centre of a municipality, we assume this does not have a significant impact.
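The straight-line approximation amounts to a Euclidean distance between two centre points. The sketch below assumes that the centres are given as (x, y) coordinates in metres in a projected coordinate system; the source does not state which coordinate system was used, and the function name is illustrative.

```python
import math


def centre_distance_km(centre_a, centre_b):
    """Straight-line distance in kilometres between two municipal centres.

    Both centres are (x, y) tuples in metres in the same projected
    coordinate system (an assumption made for this sketch).
    """
    dx = centre_a[0] - centre_b[0]
    dy = centre_a[1] - centre_b[1]
    return math.hypot(dx, dy) / 1000.0


# Two hypothetical centres 3 km apart east-west and 4 km north-south.
example_distance = centre_distance_km((0.0, 0.0), (3000.0, 4000.0))
```

For the distance from a municipality to itself this function returns zero, which is why the intramunicipal case needs a separate estimate.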

This way of approximating the distance travelled cannot be used for the intramunicipal migration data. The distance between the centre of a municipality and the centre of that very same municipality is always zero. Under the assumption that most migration movements take place over shorter distances, we estimate the intramunicipal travel distance to be about

1 2||

instead.

1. Between 1996 and 2010 a separate intermunicipal migration data set was released each year (Centraal Bureau voor de Statistiek, 2005a; Centraal Bureau voor de Statistiek, 2005b; Centraal Bureau voor de Statistiek, 2005c; Centraal Bureau voor de Statistiek, 2005d; Centraal Bureau voor de Statistiek, 2005e; Centraal Bureau voor de Statistiek, 2005f; Centraal Bureau voor de Statistiek, 2005g; Centraal Bureau voor de Statistiek, 2005h; Centraal Bureau voor de Statistiek, 2005i; Centraal Bureau voor de Statistiek, 2006; Centraal Bureau voor de Statistiek, 2007; Centraal Bureau voor de Statistiek, 2008; Centraal Bureau voor de Statistiek, 2009; Centraal Bureau voor de Statistiek, 2010; Centraal Bureau voor de Statistiek, 2011). Intermunicipal migration data after 2010 is all collected in one single data set, which is updated on a yearly basis (Centraal Bureau voor de Statistiek, 2017). All intramunicipal migration data is available in one single data set, also updated on a yearly basis (Centraal Bureau voor de Statistiek, 2018d).


Variable | Intermunicipal migration | Intramunicipal migration
Population data | Centraal Bureau voor de Statistiek, 2018a | Centraal Bureau voor de Statistiek, 2018a
Migration data | Multiple sources, see footnote 1 | Centraal Bureau voor de Statistiek, 2018d
Distance data | Estimate: distance between the centres of both municipalities | Estimate: √ 1 2||

Table 2.1: Sources for the values of the different variables used in the regression.

The linear form of the gravity model presented in Equation 2.2 can then be fitted on the data using a Generalised Linear Model (GLM). GLMs are flexible generalisations of linear regressions, in which the response variables can have a non-normal error distribution model (Nelder and Wedderburn, 1972). Because logarithms are used in this linear form and it is impossible to take the logarithm of zero, the cases in which no people migrate between two municipalities should be processed before fitting the equation.

There are two options to solve this problem: disregard the migration data when no migrants move between two municipalities in a certain year, or modify all the measured migration data in such a way that all connections have some migrants. Because choosing the first approach would mean that information is lost on municipalities that did not attract migrants, we choose the second option. Every number of migrants is increased by two, as this value minimises the deviance of the model. A regression on this data resulted in Equation 2.3 ($\chi^2(4, N = 4{,}750{,}471) = 1.4232 \cdot 10^6$, $P \le 0.001$).

$$
M_{a \to b} = \exp(-1.6175) \cdot \frac{p_a^{0.2433} \cdot p_b^{0.2327}}{\Delta_{a \to b}^{0.4760}} + \epsilon_{a \to b} \tag{2.3}
$$
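The fitting step can be illustrated with a small sketch. It uses ordinary least squares on the log-transformed counts, which is only one possible GLM configuration (Gaussian errors on the log scale); the source does not state which error family or link function was used, so this choice, the function name and the synthetic data are assumptions. The coefficients of Equation 2.3 are used only to generate noise-free test data.

```python
import numpy as np


def fit_gravity(pop_a, pop_b, dist, migrants, offset=2.0):
    """Least-squares fit of the log-linear gravity model.

    Fits log(M + offset) = delta + alpha*log(p_a) + beta*log(p_b)
                           - gamma*log(dist)
    and returns (delta, alpha, beta, gamma). The offset keeps the
    logarithm defined for pairs without any migrants.
    """
    y = np.log(np.asarray(migrants, dtype=float) + offset)
    design = np.column_stack([
        np.ones(len(y)),   # intercept, delta = log(G)
        np.log(pop_a),     # alpha
        np.log(pop_b),     # beta
        -np.log(dist),     # gamma (distance in the denominator)
    ])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return tuple(coef)


# Synthetic, noise-free data generated from the Equation 2.3 coefficients.
rng = np.random.default_rng(0)
n = 200
pop_a = rng.uniform(1e3, 1e6, n)
pop_b = rng.uniform(1e3, 1e6, n)
dist = rng.uniform(1.0, 300.0, n)
true = (-1.6175, 0.2433, 0.2327, 0.4760)
expected = np.exp(true[0]) * pop_a ** true[1] * pop_b ** true[2] / dist ** true[3]
# Subtract the offset so that the fit sees exactly `expected` after adding it back.
fitted = fit_gravity(pop_a, pop_b, dist, expected - 2.0)
```

On noise-free data the fit recovers the generating coefficients; on real counts the result would depend on the chosen error distribution.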

2.4 Expansion of the gravity model

The impact of the regional identities can be examined by expanding the gravity model with a binary indicator that is true if both municipalities have the same regional identity and false if they do not, weighted by a coefficient ι. Following this introduction, the linear version of the gravity model is also adjusted to Equation 2.4. The used parameters are shown in Table 2.2.


Parameter | Description
α | Influence of $p_a$ on the number of migrants
β | Influence of $p_b$ on the number of migrants
γ | Influence of $\Delta_{a \to b}$ on the number of migrants
δ = log(G) | Normalisation constant of the regression function
$\epsilon_{a \to b}$ | Difference between the calculated number of migrants and $M_{a \to b}$, different for each set of (a, b)
ι = log(I) | Increase in the number of migrants when both municipalities are located in the same identity region
$\Delta_{a \to b}$ | Distance between municipality a and municipality b
$M_{a \to b}$ | Number of migrants between municipality a and municipality b
$p_a$ | Population of municipality a
$p_b$ | Population of municipality b

Table 2.2: Different parameters used in the extended gravity model

2.4.1 Specification of regional identities

In order to incorporate regional identities into the model, these should first be specified by forming identity regions. To be able to comprehend both the importance of identity in regions of certain sizes and the significance of choosing the right clusters of municipalities, three different sets of identity regions are used as shown in Table 2.3.

The first two sets of identity regions consist of administrative regions, designed to compare regions of certain sizes within the EU. The first set consists of the twelve NUTS 2 regions or provinces (Eurostat, 2013), the second set of forty NUTS 3 regions or COROP regions (Centraal Bureau voor de Statistiek, 2015; Eurostat, 2013). Because of their administrative origin, these regions are easy to get hold of, but might not be fully accurate. Regions within both sets must have a minimum number of residents to allow for accurate statistical comparison (Eurostat, 2018). Because it is not guaranteed that every existing identity region has enough residents, this could mean that different smaller identity regions were combined to reach this population threshold.

The third set of identity regions is manually specified through a literature study. It consists of seventy long-standing historical identity regions. Details on these regions are found in Appendix B. Even though this specification process is a complex and demanding task, it is often also a necessity, because prespecified administrative regions are not always available.

Data set | Regions | Specification
NUTS 2 (Provinces) | 12 | Eurostat, 2013
NUTS 3 (COROP regions) | 40 | Centraal Bureau voor de Statistiek, 2015; Eurostat, 2013
Literature study | 70 | Various literature sources, as specified in Appendix B

Table 2.3: Different sets of identity regions embedded in the human migration model.


2.4.2 Introduction of the different sets of identity regions

Using the same data as before, new models can be fitted on each of the three predefined sets of identity regions. The formula fitted on the NUTS 2 regions is shown in Equation 2.5 ($\chi^2(4, N = 4{,}750{,}471) = 1.4028 \cdot 10^6$, $P \le 0.001$), the formula generated using the NUTS 3 regions in Equation 2.6 ($\chi^2(4, N = 4{,}750{,}471) = 1.2729 \cdot 10^6$, $P \le 0.001$) and the formula that is based on the regions defined through literature in Equation 2.7 ($\chi^2(4, N = 4{,}750{,}471) = 1.2712 \cdot 10^6$, $P \le 0.001$). This means that the deviance is decreased by 1.4%, 10.6% and 10.7% respectively. In all cases the influence of identity was significant ($P < 0.001$).

$$
M_{a \to b} = \exp(-2.0427) \cdot \frac{p_a^{0.2452} \cdot p_b^{0.2343}}{\Delta_{a \to b}^{0.3987}} \cdot \exp(0.3824)^{[\mathrm{region}(a) = \mathrm{region}(b)]} + \epsilon_{a \to b} \tag{2.5}
$$

$$
M_{a \to b} = \exp(-2.3727) \cdot \frac{p_a^{0.2491} \cdot p_b^{0.2361}}{\Delta_{a \to b}^{0.3373}} \cdot \exp(1.2482)^{[\mathrm{region}(a) = \mathrm{region}(b)]} + \epsilon_{a \to b} \tag{2.6}
$$

$$
M_{a \to b} = \exp(-2.3398) \cdot \frac{p_a^{0.2499} \cdot p_b^{0.2383}}{\Delta_{a \to b}^{0.3493}} \cdot \exp(1.5360)^{[\mathrm{region}(a) = \mathrm{region}(b)]} + \epsilon_{a \to b} \tag{2.7}
$$
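To make the role of the identity term concrete, the sketch below evaluates the three fitted formulas for a hypothetical pair of municipalities. The coefficients are copied from Equations 2.5 to 2.7; the function names and the example populations and distance are illustrative assumptions. The predictions are on the scale of the fitted data, which had two added to every migrant count, and the error term is left out.

```python
import math

# (delta, alpha, beta, gamma, iota) as reported in Equations 2.5 to 2.7.
COEFFICIENTS = {
    "NUTS 2": (-2.0427, 0.2452, 0.2343, 0.3987, 0.3824),
    "NUTS 3": (-2.3727, 0.2491, 0.2361, 0.3373, 1.2482),
    "Literature": (-2.3398, 0.2499, 0.2383, 0.3493, 1.5360),
}


def predicted_flow(region_set, pop_a, pop_b, distance_km, same_region):
    """Model prediction without the error term."""
    delta, alpha, beta, gamma, iota = COEFFICIENTS[region_set]
    flow = math.exp(delta) * pop_a ** alpha * pop_b ** beta / distance_km ** gamma
    # The identity term only applies when both municipalities share a region.
    return flow * math.exp(iota) if same_region else flow


def identity_multiplier(region_set):
    """Factor exp(iota) applied to flows within the same identity region."""
    return math.exp(COEFFICIENTS[region_set][4])


# Hypothetical pair: 50,000 and 120,000 residents, 25 km apart.
inside = predicted_flow("NUTS 3", 50_000, 120_000, 25.0, same_region=True)
outside = predicted_flow("NUTS 3", 50_000, 120_000, 25.0, same_region=False)
```

Within each fitted model the multiplier is roughly 1.5 for the NUTS 2 regions, 3.5 for the NUTS 3 regions and 4.6 for the literature regions. These factors belong to models with different values for the other parameters, so they should not be read as directly comparable.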

In these three equations two variables have changed by more than one tenth: γ and δ. The δ variable would previously have contained part of the ι variable, and is thus likely to have a lower value when the ι variable is introduced. Likewise, the value of γ is lowered because the variable would no longer have to account for the effects that identity has on shorter distance migration.

2.5 Comparison of the importance of identity in different sets of regions

The ι values found cannot be compared one-to-one, because the values of the α, β, γ and δ parameters differ between the models as well. We therefore need another way of comparing different sets of identity regions. Analysing the differences in the values of the ϵ variables of the basic gravity model specified in Equation 2.3 allows us to do just that.

These ϵ variables could only be analysed to find the minimal influence of identity, as we just argued that the δ and γ already partially compensate for the effects that regional identity has on the human migration network. To be able to determine this minimal influence, the ϵ values are first split into two different categories as specified in Equation 2.8.

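$$
\epsilon =
\begin{cases}
\epsilon_{in}, & \text{if both municipalities are in the same identity region} \\
\epsilon_{out}, & \text{if the municipalities are not in the same identity region}
\end{cases}
\tag{2.8}
$$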
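The split of Equation 2.8 can be sketched as a simple partition of the residuals. The data structures (a dictionary of residuals keyed by origin and destination, and a dictionary assigning each municipality to a region) and the optional `origin` filter are assumptions made for illustration; the sketch only performs the split and does not compute any statistic on the two groups.

```python
def split_residuals(residuals, region_of, origin=None):
    """Split residuals into intraregional and interregional groups.

    residuals: dict mapping (origin, destination) -> epsilon
    region_of: dict mapping municipality -> identity region
    origin:    if given, only flows leaving this municipality are used
    """
    eps_in, eps_out = {}, {}
    for (a, b), eps in residuals.items():
        if origin is not None and a != origin:
            continue
        if region_of[a] == region_of[b]:
            eps_in[(a, b)] = eps
        else:
            eps_out[(a, b)] = eps
    return eps_in, eps_out


# Toy example with three municipalities in two hypothetical regions.
residuals = {("A", "B"): 1.5, ("A", "C"): -0.5, ("B", "A"): 2.0, ("C", "A"): 0.25}
region_of = {"A": 1, "B": 1, "C": 2}
eps_in, eps_out = split_residuals(residuals, region_of)
eps_in_a, eps_out_a = split_residuals(residuals, region_of, origin="A")
```

Changing `region_of` changes which residuals fall into each group, which is why the same residuals can be reused to compare different sets of identity regions.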


data. A positive ICM value indicates that the model has more difficulty explaining the intraregional migration behaviour than the interregional migration behaviour. An ICM value can be calculated for each separate municipality, using only the ϵin and ϵout values of the migration flows originating in that particular municipality. Which migration flows count as intraregional and which as interregional depends on the identity regions used.
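The split of residuals per origin municipality can be sketched as follows. This is only an illustration of Equation 2.8: the function name and the dictionary layout are hypothetical, and the sketch stops before the ICM value itself is computed.

```python
from collections import defaultdict

def split_residuals(residuals, region_of):
    """Group residuals by origin municipality into intraregional and interregional lists.

    residuals: {(origin, destination): epsilon} from the basic gravity model
    region_of: {municipality: identity region}
    """
    eps_in = defaultdict(list)
    eps_out = defaultdict(list)
    for (origin, destination), eps in residuals.items():
        if region_of[origin] == region_of[destination]:
            eps_in[origin].append(eps)   # flow stays inside the identity region
        else:
            eps_out[origin].append(eps)  # flow crosses a region boundary
    return eps_in, eps_out
```

Changing `region_of` to another set of identity regions changes which flows end up in each list, while the residuals themselves stay the same.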

A comparison between these different ICM values can give an indication as to what regions contain stronger identities, or whether the municipality is part of the right identity region. This does not imply that an ICM value can be translated directly to the influence regional identity has on migration. Instead, the ICM value defines the unexplained differences in the remaining deviance that could not be included in the γ and δ variables.

2.6 Creation of other sets of identity regions

When an ICM value is positive, it cannot directly be concluded that the predefined identity regions have a real influence on the human migration decision. Positive ICM values might also occur in randomly generated regions, just because the municipalities within each region are located in proximity to one another. This possibility can be excluded by further comparing the ICM values of the predefined identity regions with the average ICM values generated by the same number of randomly generated regions.
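One way to express such a comparison is the number of sample standard deviations by which the ICM value of the predefined regions exceeds the mean of the randomly generated sets. The snippet below is a sketch under that assumption. The function name is hypothetical, and the use of the sample rather than the population standard deviation is a choice made here for illustration.

```python
import statistics

def standard_deviations_above(reference_icm, random_icms):
    """Distance of a reference ICM value from the mean of random sets, in standard deviations."""
    mean = statistics.mean(random_icms)
    spread = statistics.stdev(random_icms)  # sample standard deviation (assumption)
    return (reference_icm - mean) / spread
```

A value close to zero would suggest that the predefined regions behave like an arbitrary set of regions of the same size.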

2.6.1 Randomly generated regions

A set of random regions can be generated by assigning each municipality to a random region, while making sure that every region consists of at least two municipalities. Such a set of regions is expected to have a mean ICM value of zero, because random combinations of municipalities are not likely to hold the same identity. This also means that the ι variable in the extended gravity model would be close to zero.
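A possible way to generate such a set is sketched below. The seeding strategy (first give every region two municipalities, then distribute the rest uniformly) is an assumption, as the text only states the constraint. The function name is hypothetical.

```python
import random

def random_regions(municipalities, k, rng=None):
    """Assign every municipality to one of k regions, with at least two per region."""
    rng = rng or random.Random()
    if len(municipalities) < 2 * k:
        raise ValueError("need at least two municipalities per region")
    shuffled = list(municipalities)
    rng.shuffle(shuffled)
    assignment = {}
    # Seed every region with two municipalities so the constraint always holds.
    for region in range(k):
        assignment[shuffled[2 * region]] = region
        assignment[shuffled[2 * region + 1]] = region
    # Assign the remaining municipalities uniformly at random.
    for municipality in shuffled[2 * k:]:
        assignment[municipality] = rng.randrange(k)
    return assignment
```

Passing a seeded `random.Random` instance makes a generated set reproducible.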

2.6.2 Randomly generated spatially clustered regions

Using randomly generated regions might however not be realistic. In real life, identity regions are not scattered all over the country, but present in a cluster of municipalities. A more realistic approach to generating random regions would thus enforce that municipalities located within a region should at least form a spatial cluster together.

The k-means algorithm is used to create such spatial clusters of municipalities (MacQueen, 1967). Instead of randomly assigning each municipality to a region, a random centre point is assigned to each region. Each municipality is then assigned to the closest centre point, after which the centre point of each region becomes the geographical centre of the municipalities belonging to that region. Some municipalities might then be located closer to the centre point of another region, which means that such a municipality is relocated to that other region. This process is repeated until no more municipalities are reassigned to another region. A visual representation of this algorithm is shown in Figure 2.1.


Figure 2.1: Visual representation of the steps taken to assign municipalities to spatially clustered regions using the k-means algorithm.

This algorithm can be applied to generate different numbers of regions. This number of regions is controlled by the variable k. To be able to compare the generated regions with a certain set of identity regions, this k is set equal to the number of regions present in this set of identity regions.

When this algorithm is applied the generated regions are likely to partly overlap with the identity regions, because both types of regions are spatially clustered. This means that the ICM values become positive, and the ι values larger than they were in the randomly generated regions.
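The clustering steps described in this section can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used for the results. It assumes planar coordinates for the municipality centres, draws the initial centre points from those centres, and does not enforce a minimum region size. All names are hypothetical.

```python
import math
import random

def kmeans_regions(centres, k, rng=None, max_iter=100):
    """Cluster municipalities into k spatial regions.

    centres: {municipality: (x, y)} planar coordinates of the municipality centres
    returns: {municipality: region index}
    """
    rng = rng or random.Random()
    names = list(centres)
    # Assumption: initial centre points are drawn from the municipality centres.
    points = [centres[m] for m in rng.sample(names, k)]
    assignment = {}
    for _ in range(max_iter):
        # Assign every municipality to the closest centre point.
        new_assignment = {
            m: min(range(k), key=lambda r: math.dist(centres[m], points[r]))
            for m in names
        }
        if new_assignment == assignment:
            break  # no municipality was reassigned
        assignment = new_assignment
        # Move each centre point to the geographical centre of its members.
        for r in range(k):
            members = [centres[m] for m in names if assignment[m] == r]
            if members:  # keep the old centre point if a region is empty
                points[r] = (sum(x for x, _ in members) / len(members),
                             sum(y for _, y in members) / len(members))
    return assignment
```

Setting `k` to the number of regions in a predefined set gives a randomly generated set of the same size, as described above.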

2.7 Optimisation of a set of regions

It could furthermore be interesting to see whether a set of regions can be optimised further. If a certain predefined set of regions cannot be further optimised, or the ICM values of the optimised randomly spatially clustered regions are lower or similar to the ICM value of the predefined region, we would expect that these predefined regions are well defined.

When a set of regions is optimised, we want to increase the average ICM value. When this value is increased, the differences in the predictive value of the model between the interregional and intraregional migration are enlarged. As this difference becomes larger, the strength of the identity contained within the defined regions also becomes larger. Under the constraint that every resulting region has at least two municipalities, we specified an algorithm to increase the ICM values in Algorithm 1. Without this constraint it would become impossible to calculate the ICM value.


Data: a set of regions, each containing at least two municipalities
Result: a set of regions with a better average ICM value than before

initialise current regions filled with municipalities;
do
    initialise new regions empty;
    for every municipality in the Netherlands do
        determine current region;
        determine the regions neighbouring the municipality;
        determine optimal region using Equation 2.10;
        if optimal region different than current region and current region will have at least two municipalities in the new regions and fifty percent chance then
            add municipality to the optimal region in the new regions;
        else
            add municipality to the current region in the new regions;
        end
    end
    current regions become the new regions;
while not every municipality in optimal region and these municipalities are not the same for the last five iterations;

Algorithm 1: Optimisation algorithm used to increase the average ICM value for a set of regions.

The function that is used to find the optimal region for a certain municipality is shown in Equation 2.10. In this function the notation $M_{a \to b, \mathit{year}}$ is used to denote that only migration data for that year is used in the M function. This allows us to take the median value of the corrected migration data, which means that outlier data will have very little influence on the optimisation process. In this formula the effects of a municipality relocation are evaluated by looking at the change in the ICM values of the municipalities in that region, and the change in the ICM value of the municipality itself.

Because identity regions are usually not scattered all over the country, this optimisation algorithm is limited to only assign a municipality to one of the regions that also contains a neighbouring municipality.

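\begin{equation}
\text{value for optimal region}(a) = \max\left( \frac{1}{\left|\{\, b \mid b \in \text{region},\ a \neq b \,\}\right|} \cdot \sum_{b \in \text{region}} \Big( \operatorname{med}\big(\{ M_{a \to b, \mathit{year}} \mid \mathit{year} \in \mathit{years} \}\big) + \operatorname{med}\big(\{ M_{b \to a, \mathit{year}} \mid \mathit{year} \in \mathit{years} \}\big) \Big) \right) \tag{2.10}
\end{equation}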
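A hedged reading of Equation 2.10 in Python is given below. The data layout, the function names, the treatment of missing flows as zero and the exclusion of the municipality itself from the sum are assumptions made for this sketch.

```python
import statistics

def region_score(a, members, flows, years):
    """Mean of the median two-way flows between municipality a and a region's other members.

    flows: {(origin, destination, year): corrected migration value}
    """
    others = [b for b in members if b != a]
    if not others:
        return float("-inf")  # assumption: an empty candidate region is never optimal
    total = 0.0
    for b in others:
        # Medians over the years limit the influence of outlier years.
        total += statistics.median(flows.get((a, b, y), 0.0) for y in years)
        total += statistics.median(flows.get((b, a, y), 0.0) for y in years)
    return total / len(others)

def optimal_region(a, candidate_regions, flows, years):
    """candidate_regions: {region id: list of municipalities}; returns the best region id."""
    return max(candidate_regions,
               key=lambda r: region_score(a, candidate_regions[r], flows, years))
```

In this sketch `candidate_regions` is expected to contain only the regions that the municipality is allowed to move to.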

In order to prevent the algorithm from creating a deadlock situation, there is only a fifty percent chance of reassigning a municipality to the determined optimal region, given that there will still be two municipalities left in the region the municipality belonged to. When this probability is not introduced, situations can occur in which two municipalities that should be in the same region can never end up together. Once all municipalities are located in their optimal region or the same set of municipalities is relocated for five iterations, the municipality relocation process is ended.
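A single relocation pass with the fifty percent rule and the two-municipality constraint could look like the sketch below. The function signature is hypothetical, and the way region sizes are tracked during a pass is an assumption, since the text does not specify it.

```python
import random

def relocation_pass(assignment, optimal_region_of, rng=None, move_probability=0.5):
    """Perform one pass over all municipalities.

    assignment: {municipality: region}
    optimal_region_of: callable(municipality, assignment) -> region
    returns: (new assignment, list of relocated municipalities)
    """
    rng = rng or random.Random()
    sizes = {}
    for region in assignment.values():
        sizes[region] = sizes.get(region, 0) + 1
    new_assignment = {}
    moved = []
    for municipality, current in assignment.items():
        target = optimal_region_of(municipality, assignment)
        can_leave = sizes[current] > 2  # two municipalities must stay behind
        if target != current and can_leave and rng.random() < move_probability:
            new_assignment[municipality] = target
            sizes[current] -= 1
            sizes[target] = sizes.get(target, 0) + 1
            moved.append(municipality)
        else:
            new_assignment[municipality] = current
    return new_assignment, moved
```

An outer loop would call this function repeatedly and stop once the returned list is empty or has been identical for the chosen number of iterations.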

The resulting region configuration is a local optimum. Because there are many such local optima, the algorithm has to be executed several times to find the best configuration that can be reached from a particular starting configuration. Because the algorithm does not accept changes that lower the ICM value, this does not necessarily mean that the optimal configuration of regions can be reached. As a result, we cannot say for sure that the best accessible local optimum is in fact the global optimum.


Figure 2.2: A visual representation of the optimisation algorithm. Given a certain starting configuration, it is determined what regions are located within a distance of 20 kilometres from each existing municipality. For each of these municipalities, the change in the global average ICM value is measured when a municipality would be part of that region. Every municipality that should be part of another region than it already was, is then relocated with a chance of 50%. This process is repeated until no more municipalities are relocated for three iterations.

When the starting configuration is created through thorough research, we could argue that this resulting configuration could actually be the global optimum. It could after all be considered to be very unlikely that the optimal configuration would differ significantly from a well researched configuration. When more complicated methods are used to actually find the global optimum, complications can arise, as further explained in Appendix C.


CHAPTER 3

The significance of the influence of the specified identity regions

The ICM values of the generated random, randomly spatially clustered, optimised randomly spatially clustered and optimised identity regions can be compared with the ICM value of the prespecified identity regions. By comparing these ICM value distributions we can determine whether the effects we attribute to regional identity could also be attributed to chance. In the following sections we will compare the distributions of mean and median values for each of the mentioned types of regions.

Because the gravity model depends on various parameters that are estimated from the data, as well as an estimated distance variable, we have to test whether our conclusions also hold if these parameters are slightly varied. In each case, the α, β, γ and δ parameters are varied by 10% to support the conclusions we draw on the differences that exist between the ICM values of the randomised spatially clustered regions and the ICM value of the predefined regions.
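A set of varied parameter configurations can be generated as in the sketch below. It assumes that each parameter is lowered by 10%, kept, or raised by 10%, and that all combinations are tested. The configurations actually used may differ, and the parameter names and example values are illustrative.

```python
import itertools

def parameter_variations(params, relative_change=0.10):
    """Yield every combination in which each parameter is lowered, kept, or raised."""
    names = list(params)
    factors = (1 - relative_change, 1.0, 1 + relative_change)
    for combination in itertools.product(factors, repeat=len(names)):
        yield {name: params[name] * factor
               for name, factor in zip(names, combination)}
```

For four parameters this full grid contains 81 configurations, including the unchanged one.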

Besides these distributions of average ICM values, each of these single data sets also has a geographical distribution of ICM values. A closer examination of these ICM values can help in identifying municipalities that are located in the wrong region beforehand, or help in understanding the effects of the used algorithms. The geographical distributions of the data sets are included in Appendix D.


3.1 Differences in the mean ICM values

As seen in Figure 3.1, the ICM value of the NUTS 2 regions lies within the 95% confidence interval of the ICM values of the randomly spatially clustered regions. This means that the NUTS 2 regions cannot be distinguished from the randomly spatially clustered regions. Despite this, Figure 3.2 shows that the ICM value of the NUTS 2 regions remains higher than the average ICM value of the randomly spatially clustered regions, even when parameters change (0.96 standard deviations under default parameters).

Figure 3.1: The mean ICM values for 250 sets of twelve randomly generated regions, 250 sets of twelve randomly generated spatially clustered regions, fifty sets of twelve optimised randomly generated spatially clustered regions and fifty sets of optimised NUTS 2 regions, compared to the ICM value of the original NUTS 2 regions.

Figure 3.2: Whatever changes in parameters we make to the human migration model, the mean value of the ICM values of the randomly generated spatially clustered regions is always significantly lower than the ICM value of the NUTS 2 regions. For each of the parameter configurations, thirty different randomly generated spatially clustered regions were generated.


As seen in Figure 3.3, the ICM value of the NUTS 3 regions is significantly higher than the ICM values of the randomly spatially clustered regions. This means that the NUTS 3 regions can easily be distinguished from the randomly spatially clustered regions. This large difference is also shown in Figure 3.4. Even when parameters in the model change, the difference between the randomly spatially clustered regions and the predefined NUTS 3 regions is high (9.19 standard deviations under default parameters).

Figure 3.3: The mean ICM values for 250 sets of forty randomly generated regions, 250 sets of forty randomly generated spatially clustered regions, fifty sets of forty optimised randomly generated spatially clustered regions and fifty sets of optimised NUTS 3 regions, compared to the ICM value of the original NUTS 3 regions.


As seen in Figure 3.5, the ICM value of the literature defined regions is also significantly higher than the ICM values of the randomly spatially clustered regions. Though the difference is smaller than it was in the NUTS 3 region comparison, we still find that there are large differences between the average ICM value of the randomly spatially clustered regions and the ICM value of the literature defined regions (2.90 standard deviations under default parameters). Even when parameters in the model change, the differences between the randomly spatially clustered regions and the literature defined regions are high, as shown in Figure 3.6.

Figure 3.5: The mean ICM values for 250 sets of seventy randomly generated regions, 250 sets of seventy randomly generated spatially clustered regions, fifty sets of seventy optimised randomly generated spatially clustered regions and fifty sets of optimised literature regions, compared to the ICM value of the original regions specified through literature.

Figure 3.6: Whatever changes in parameters we make to the human migration model, the mean value of the ICM values of the randomly generated spatially clustered regions is always significantly lower than the ICM value of the literature defined regions. For each of the parameter configurations, thirty different randomly generated spatially clustered regions were generated.


3.2 Differences in the median ICM values

Whereas the mean ICM value of the NUTS 2 distribution was located within the distribution of the mean ICM values of the randomly generated spatially clustered regions, the median ICM value of the distribution is lower than all the median ICM values of the randomly generated spatially clustered regions. A comparison of these median values is shown in Figure 3.7. This figure also shows that the median values of the optimised NUTS 2 regions are generally lower than the median values of the optimised randomly generated spatially clustered regions. On the other hand, the median values of the NUTS 3 distributions and literature defined regions are located within the distribution of median ICM values of the randomly generated spatially clustered regions. As can be seen in Figures 3.8 and 3.9, the median ICM values of both optimised distributions are indistinguishable as well.

Figure 3.7: The median ICM values for 250 sets of twelve randomly generated regions, 250 sets of twelve randomly generated spatially clustered regions, fifty sets of twelve optimised randomly generated spatially clustered regions and fifty sets of optimised NUTS 2 regions, compared to the ICM value of the original NUTS 2 regions.


Figure 3.8: The median ICM values for 250 sets of forty randomly generated regions, 250 sets of forty randomly generated spatially clustered regions, fifty sets of forty optimised randomly generated spatially clustered regions and fifty sets of optimised NUTS 3 regions, compared to the ICM value of the original NUTS 3 regions.

Figure 3.9: The median ICM values for 250 sets of seventy randomly generated regions, 250 sets of seventy randomly generated spatially clustered regions, fifty sets of seventy optimised randomly generated spatially clustered regions and fifty sets of optimised literature regions, compared to the ICM value of the original regions specified through literature.


3.3 Sensitivity to parameter changes in the optimisation algorithm

Within the optimisation algorithm one arbitrarily chosen parameter is used: the distance cut-off. In the design of this algorithm, the decision was made that every municipality could only be relocated to a region located within twenty kilometres of that municipality. It can be argued that this cut-off makes sense, because the largest distance from any municipality to its nearest neighbouring municipality is more than ten kilometres but less than twenty kilometres. It would however be even better to test the differences in the outcomes of the optimisation algorithm when different distance cut-offs are used. We therefore test whether the difference between the ICM value distributions of the optimised randomly spatially clustered regions and the ICM value distributions of the optimised predefined regions differs significantly when this distance cut-off is changed to either ten or thirty kilometres.
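How the cut-off restricts the candidate regions of a municipality can be illustrated with the sketch below. It assumes that a region counts as being within the cut-off when at least one of its municipality centres lies within that distance, and that centres are given in planar kilometre coordinates. Both are assumptions, as are the names used.

```python
import math

def candidate_regions(municipality, centres, assignment, cutoff_km=20.0):
    """Regions with at least one other municipality centre within cutoff_km.

    centres: {municipality: (x_km, y_km)}
    assignment: {municipality: region}
    """
    here = centres[municipality]
    return {
        assignment[other]
        for other, point in centres.items()
        if other != municipality and math.dist(here, point) <= cutoff_km
    }
```

Calling the function with cut-offs of ten, twenty and thirty kilometres shows how the set of reachable regions grows with the cut-off.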

Figure 3.10: Differences between the mean ICM values of the optimised randomly spatially clustered regions and the mean ICM values of the optimised predefined regions for each predefined set of regions, using three different distance cut-offs in the optimisation algorithm. For each of the parameter configurations, thirty different optimised regions were generated.

As the cut-off distance in the optimisation algorithm changes, we see in Figure 3.10 that this has different effects on the differences between the mean ICM values of the optimised randomly spatially clustered regions and the mean ICM values of the optimised predefined regions for


gions. As the cut-off distance increased, the difference between the optimised literature regions and the randomly generated spatially clustered regions became worse. Considering that the median ICM value of the literature regions was not better when we started the algorithm, this can likely only be explained by the fact that most "optimal changes" for the literature regions are located within a ten kilometre radius, while this is not the case for the randomly generated spatially clustered regions. Even though we cannot reach a definite conclusion on, or interpretation of, the NUTS 3 data, we see the same pattern emerge in the set of NUTS 2 regions.

Figure 3.11: Differences between the median ICM values of the optimised randomly spatially clustered regions and the median ICM values of the optimised predefined regions for each predefined set of regions, using three different distance cut-offs in the optimisation algorithm. For each of the parameter configurations, thirty different optimised regions were generated.


CHAPTER 4

Discussion

The initial expansion of the gravity model with three different sets of identity regions gave us three different equations that showed how the influence of these identity regions should be incorporated into the model. As shown in Equation 2.5, individuals were 1.46 times more likely to move towards a location if that location lies in the same NUTS 2 region. The introduction of these NUTS 2 regions to the gravity model decreased the deviance of the model by 1.4%.

When the NUTS 3 and literature defined regions are added, a much larger effect is seen. As seen in Equations 2.6 and 2.7, people are respectively 3.48 and 4.65 times more likely to move to a certain location when it is located within the same region. After the introduction of these regions to the gravity model, the deviance of the model decreased by 10.6% and 10.7% respectively. It thus seems that the identity regions should be small to be able to actually contribute to better predictions. This idea is further supported by the mean ICM value comparison graphs shown in Figures 3.1, 3.3 and 3.5. Whereas the average ICM values of the NUTS 2 regions could not be distinguished from the randomly spatially clustered regions, the average ICM values of the smaller NUTS 3 and literature defined regions could. These conclusions stayed the same when the different parameters of the model were altered by 10%.

Even though the maximal ICM value of all three sets of optimised identity regions is higher than the maximal ICM value of their optimised randomly generated spatially clustered counterparts, we find that the distributions are sensitive to changes in the distance cut-off used in the optimisation algorithm. In Figure 3.10 we see that only the mean value of the optimised NUTS 3 regions is always higher. When the median values are compared, we also find that the NUTS 2 regions perform worst: the median ICM value is, regardless of the cut-off distance chosen, lower


number of people that migrate over smaller distances, because part of these migration numbers are also increased by the ι parameter when both locations are located in the same identity region. The value of the γ parameter is more important in longer distance migration: as this parameter is lowered, the number of migration movements over longer distances drops. Comparing the γ parameters of the basic gravity model (0.4760) and the gravity model extended with NUTS 3 identity regions (0.3373), we find that the model estimates for the number of migrants over a distance of 100 kilometres are seven times higher in the original model than in the model with the NUTS 3 identity regions. This means that we should accept H2, as the introduction of regional identities seems to have a great impact on the way that distance is handled in the model.

To be able to reach these conclusions, we had to make several decisions about the way the model, ICM value and identity regions are specified. In the following sections we will discuss those decisions, and analyse the quality of the used identity regions.

4.1 Specification of the model

The migration data could have been explained using either a gravity or a radiation model. Because it would be easier to understand the influence of identity on the way distance is incorporated in the model when this distance is explicitly included, we chose to use a gravity model. A radiation model would make this analysis more complicated than necessary.

In this gravity model the distances between two municipalities are approximated by taking the geographical distance between the centres of these municipalities. This is an assumption that works for the Netherlands, because the country is flat and has a lot of roads and bridges, except perhaps for two municipalities that are located at opposite shores of the IJsselmeer. When travel times do not necessarily correlate with the geographical distance, it might be better to find the actual travel times between two locations instead, taking into account the location of the population centres within those municipalities as well.
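The text does not state how the geographical distance between two centres is computed. As one possible approach, and purely as an assumption of this sketch, the great-circle distance between two latitude and longitude pairs can be calculated with the haversine formula. The function name and the Earth radius constant are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch

def centre_distance_km(centre_a, centre_b):
    """Great-circle distance between two (latitude, longitude) pairs in degrees."""
    lat_a, lon_a = map(math.radians, centre_a)
    lat_b, lon_b = map(math.radians, centre_b)
    h = (math.sin((lat_b - lat_a) / 2) ** 2
         + math.cos(lat_a) * math.cos(lat_b) * math.sin((lon_b - lon_a) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))
```

For a country the size of the Netherlands a planar projection would give nearly the same distances, so the choice between the two is mostly a matter of the available coordinate system.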

The population parameters in the model were fitted using the intermunicipal and intramunicipal migration data that is available from Statistics Netherlands. In some cases no one had migrated between two certain municipalities in a certain year. This data could not be excluded, as that would mean a lot of longer distance migration data could not be used in the regression. To solve this problem we have added two migrants to every migration flow, as this minimised the deviance of the original model.

4.2 Specification of the ICM value

The introduction of a set of identity regions to a migration model does not only create an ι parameter for the model, but also has an influence on the other existing parameters. Because the other parameters are not the same for each set of identity regions, the ι parameters cannot be compared directly. We have introduced the ICM value to solve this problem. For each municipality, this value represents the ratio between the unexplained variance of the intraregional and interregional migration after applying the basic model.

While this ICM value makes the different distributions and configurations comparable, it can be confusing that the ICM value does not directly match the influence of identity on migration itself. Only the ι parameter in the extended model can be used to determine that influence.

4.2.1 Other influences that can contribute to the ICM value

Even though the ICM value was designed to be able to compare the effects of the identity regions, it could be that other factors also contribute to the differences in the remaining variance. These factors would attract relatively more migrants on a regional scale than on an interregional scale.


Pull factors that have an equal effect on people living within the identity region and outside of the identity region should have no effect on the ICM value. We would expect that most amenities such as industrial areas, beautiful nature, and housing have such equal effects. Most of the influences that increase the ICM value will be part of the regional identity itself. An important factor in migration that might also increase the ICM values of municipalities is the economic motivation behind migration. When the industry in a region is highly specialised, we might find that people are more likely to move within their region than outside the region because they want to keep working in the same industry.

The presence of industries could also have the opposite effect: when employment possibilities in a certain identity region are very low, people could decide to migrate towards another region where they can get a job. In such cases a lot of people can move out of the identity region, thus decreasing the ICM values. A similar situation occurs when there are only few or no universities available in a region. As education possibilities in the identity region are very low, people could decide to move somewhere they can get the education they want.

When people are forced to migrate, this can have a negative effect on the ICM values of regions. Good examples of forced migration are hard to come by, as it does not happen very often. But when hundreds of newly registered asylum seekers are relocated to another immigration centre or factories are relocated and employees have to move, the effects will be hard to ignore.

Even though the ICM value is not only influenced by the effects of identity itself, we expect that the positive and negative effects created by other factors are very limited and not very prevalent. We therefore do not expect these factors to have a high influence on the average ICM value of all regions.

4.3 Identity regions

In this research we used three different sets of identity regions. Two of these sets are specified and used by the government to analyse the developments of regions, and one set was specified through literature research.

There are clear benefits in using hard-bounded identity regions to incorporate these identities in a model. The regions can easily be specified through extensive literature research, and can be applied almost everywhere. After discussing the quality of the used identity regions, we will also discuss the challenges in using these identity regions, and suggest other approaches that can be used to include regional identity in the model. The main problem with these approaches, is that they often require more data to be applied: data that is not always present.


This would imply that the ICM values of these optimised prespecified regions could be very near the global optimal average ICM value, and that the prespecified regions are a better starting position to find an optimal configuration than an optimised randomly spatially clustered region. In Figure 3.3 we also notice something strange: the optimisation algorithm actually decreased the average ICM value of the NUTS 3 regions. Because we use a local optimisation strategy to increase the average ICM value, the optimisation algorithm increased the median ICM value instead, as shown in Figure 3.8. While this shows that the algorithm does work on a local scale, the algorithm might be improved further to work better on a global scale.

4.3.2 Quality

It is very difficult to determine the exact quality of the used predefined regions. Quality can after all mean different things in different contexts. It could be more useful to describe the quality of these regions instead.

The lack of quality of the predefined regions lies mostly within the fact that several municipalities can be relocated in such a way that their negative ICM values become positive. It seems that the allocations of these municipalities are obvious flaws. Apart from that, other municipalities that do have positive ICM values can be relocated to increase the average ICM value even further. Comparisons between the predefined regions and optimised regions are shown in Figures 3.1, 3.3 and 3.5.

Whether the fact that the predefined regions can still be optimised is a lack of quality could be a topic of debate. After all, all the predefined regions have higher average ICM values than the randomly generated spatially clustered regions. It could therefore be argued that these sets of predefined regions are defined in a quite unique way, as most configurations with similar regions have a much lower ICM value.

The fact that the average ICM value of the optimised randomly generated spatially clustered regions could in some cases become higher than the original average ICM value of the corresponding set of regions could matter. It would mean that it would be very easy to define regions that perform better than the regions created beforehand. This is the case with the NUTS 2 regions as shown in Figures ?? and 3.10. This supports the earlier presented evidence that the NUTS 2 regions might not be the ideal regions to use when incorporating regional identities in a human migration model.

4.3.3 Challenges in the usage of identity regions

The most important challenge in the usage of these identity regions, is that in practice, identity regions are usually not hard bounded. Some municipalities could be part of multiple identities, and in other municipalities there might be small minority groups of people that hold another identity. A simple approach to this problem would be to introduce fuzzy boundaries. In this case, municipalities that are located close to a certain region are assumed to house people that also have the same regional identity as people living in that region.

A more sophisticated approach to this problem would be to research the social connectivity between municipalities. By doing this, all different identities that make up a certain municipality could accurately be represented. This would create an identity network as shown in Figure 4.1. It could however be difficult to use this approach, because detailed data on every individual's social connections has to be acquired. Such data is often hard to acquire or not available at all. As a result, this way of incorporating regional identities is not applicable in most situations.


Figure 4.1: Three different approaches to defining identity regions: with hard boundaries, fuzzy boundaries, or by looking at the existing connections between two municipalities.

Another challenge with using a binary approach to regional identity arises when those regions are embedded into a model. The number of migrants is multiplied by a constant factor when two municipalities are located in the same region, but the strength of a regional identity could actually differ in the different identity regions. People could either value their common identity differently, or might hold multiple identities at once. Unfortunately these differences are difficult to estimate beforehand.

It is also not very clear how many identity regions should be used in a model. Finding out how many regions should actually be present in the data set can be a challenge as well. There will not always be one right answer: related regional identities could be combined or split, as evidenced by the relatively small differences in ICM values between the set of NUTS 3 regions and the regions defined through literature research. As long as the used regions are well researched, they can be used.


CHAPTER 5

Conclusions

All in all, we have extended a basic gravity model for human migration with the notion of identity regions. This model was fitted using all Dutch intramunicipal and intermunicipal migration data between 1996 and 2016. The three different sets of identity regions used in this research all had a significant effect on the model, and decreased the deviance. When the NUTS 3 and literature defined identity regions were added to the model, people seemed to be respectively 3.48 and 4.65 times more likely to move to a certain location when it was located within the same identity region.

By choosing to model the identity regions with strict boundaries, we created a way of incorporating identity regions that can easily be used in other models as well. For each model it should be possible to create a set of identity regions through literature research. This means that the influence of regional identity on human migration behaviour can no longer be neglected, and should be included in future human migration models.

We have also shown that it would be better to use smaller regions whenever that is possible. In this research the larger NUTS 2 regions could not decrease the deviance of the model very well (1.4% versus 10.6% and 10.7%), nor were the effects caused by these regions distinguishable from the effects caused by the same number of randomly generated, spatially clustered regions.

When the identity regions were added to the gravity model, this had an effect on the way distance was handled. Instead of dividing the potential number of migrants by $\text{distance}^{0.4760}$, the models using the NUTS 3 and literature-defined identity regions divided the potential number of migrants by $\text{distance}^{0.3373}$ and $\text{distance}^{0.3493}$. This is a very interesting finding. Because a lot of municipalities that are located within proximity of one another are located within the same identity region, the predicted migration numbers between those municipalities are often increased


CHAPTER 6

Future work

Because regional identity seems to be an important factor in the migration decision, it would be interesting to expand this research to radiation models as well, or add in more variables that are already used in other models.

It could also be interesting to look at the influence of regional identity in different circumstances, as we only looked at the influence of regional identity on recent internal Dutch migration figures. It would be interesting to see what would happen in more segregated societies, or in societies where most people travel a lot.

Additionally, it could be interesting to look at the way the influence of identity has evolved over time. By looking at internal migration data over a hundred-year period, changes could be detected and then mapped to changes that occurred in a certain society. This might help in understanding those regional identities a bit better.

Apart from these different applications, the current application could be further improved by using median or quantile values in the ICM function, instead of mean values. In this particular case, this was not possible because the number of data points was too small for certain regions, and there were too many zero-migration data points in the data set. When larger regions are used over longer time periods, this additional research might increase the predictive value of the ICM value.
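The problem with zero-heavy data can be illustrated with a small sketch. The counts below are invented for illustration only and do not come from the Dutch migration data. The example does not reproduce the ICM function itself; it only compares summary statistics.

```python
import statistics

# Hypothetical migrant counts between municipality pairs in one small region;
# most pairs exchange no migrants at all.
flows = [0, 0, 0, 0, 0, 0, 0, 2, 5, 31]

mean_flow = statistics.mean(flows)            # pulled up by the few non-zero pairs
median_flow = statistics.median(flows)        # collapses to zero
quartiles = statistics.quantiles(flows, n=4)  # lower quartiles collapse as well

print(mean_flow, median_flow, quartiles)
```

When more than half of the observations are zero, the median is zero regardless of how large the remaining flows are. Any measure built on it then loses the ability to separate regions from one another.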

This does not mean that this research cannot be improved upon. The optimisation algorithm did not always work as intended, and could thus be improved. In the current situation, the average ICM value of the literature-defined regions dropped when the algorithm was applied. Furthermore, it might be interesting to adjust the algorithm to allow parts of regions to move


