Comparison of Convergence between EU and non-EU Countries


Wouter Vink

11774517

Date of final version: July 14, 2021

Master’s programme: Econometrics

Specialisation: Financial Econometrics

Supervisor: Artūras Juodis

Second reader: Bart Keijsers

This research compares the convergence of the log GDP per capita of the countries in the European Union (EU) and the European countries outside the EU, using a panel dataset of 44 countries for the years 1994-2018. We measure β−convergence using the generalized method of moments and fixed effects models. Unconditional β−convergence is stronger in the non-EU countries, whereas conditional β−convergence is stronger in the EU countries. We measure σ−convergence by the decrease of the cross-sectional standard deviation and coefficient of variation over time. σ−convergence is stronger in non-EU countries, but only if all Schengen countries are treated as EU countries. The European countries can be divided into five convergence clubs, where the EU countries are often in the richest convergence clubs and the non-EU countries in the poorest convergence clubs.


Statement of Originality

This document is written by Wouter Vink who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


CONTENTS

1 Introduction
2 Literature review
  2.1 The European Union
  2.2 Former Research
    2.2.1 Convergence clubs
    2.2.2 β−convergence
    2.2.3 σ−convergence
  2.3 Research questions
3 Methodology
  3.1 Data
  3.2 Spatial effects
  3.3 Convergence
    3.3.1 GMM estimation
    3.3.2 FE estimation
4 Results
  4.1 Convergence clubs
    4.1.1 Creating the convergence clubs
    4.1.2 Sensitivity of the convergence clubs
    4.1.3 Explanation of the convergence clubs
  4.2 β−convergence
    4.2.1 GMM estimation
    4.2.2 FE estimation
  4.3 σ−convergence
5 Conclusion
References
Appendix
  A.1. Article overview
  A.2. Mathematical properties of the log t test
  A.3. Convergence club algorithm
  A.4. Variable Description
  A.5. Summary statistics
  A.6. Data preparation
  A.7. The great circle distance
  A.8. Moran’s I statistic
  A.9. Convergence clubs sensitivity


1. INTRODUCTION

The differences in the log GDP per capita of the countries in the European Union (EU) have decreased since the EU was founded. For example, Bulgaria was the poorest country in 1994 with a log GDP per capita of 8.33, and Luxembourg was the richest country with a log GDP per capita of 11.22, which is 35 per cent higher. In 2018, Bulgaria was still the poorest country with a log GDP per capita of 9.07, and Luxembourg was still the richest country with a log GDP per capita of 11.61, which is only 28 per cent higher.

In the context of country economies, convergence is defined as the process where the differences in the financial structures of countries decrease over time. Convergence is observed if relatively rich countries develop less than relatively poor countries, such that the disparities between the countries decrease (Monfort, 2008). The most often used measure of a country's financial structure is the development of the log GDP per capita.

European integration started after the Second World War and led to the EU in 1993, after the Maastricht Treaty was signed. An increasing number of countries participate in this European cooperation, which currently has 27 member states. An essential point of the EU is that there are no trade and tariff barriers between the member states, such that free movement of goods, services, capital, and labor exists. Another aspect of the EU is the replacement of the national currencies by the euro, and the European Central Bank, which is responsible for the monetary policy of the EU countries (Neal, 2007). The EU cohesion fund financially supports the poorest EU countries to reduce the disparities in the EU (Monfort, 2008). These EU policies suggest that the economies of the EU member states should converge to each other.

Much research exists on the convergence of countries and regions in the EU. This research contributes to the existing literature by comparing the convergence of European countries in and outside the EU. In this way, we can compare the EU and the non-EU countries to see whether the EU policies could affect the convergence of its member states. Besides, countries can see how EU membership potentially influences their economy, which can be considered when countries may want to join the EU.


We investigate whether some groups of countries converge to the same steady-state using the convergence club algorithm proposed by Phillips & Sul (2007). We also determine which variables explain the convergence using an ordered logistic model. We measure convergence using the generalized method of moments (GMM) with both one-step and two-step methods. We also estimate fixed effects (FE) models, where we use the split panel jackknife (SPJ) bias-correction.

We observe β−convergence if the previous log GDP per capita value negatively affects the log GDP per capita growth, such that relatively rich countries develop less compared to relatively poor countries. We observe σ−convergence if the cross-sectional standard deviation and coefficient of variation decrease over time, such that the disparities between countries reduce. We use yearly data in the period 1994-2018 of 44 European countries to investigate the financial development of the log GDP per capita of the European countries after the start of the EU.

Our results show that five convergence clubs exist, with Ukraine as the only diverging country. EU and non-EU countries almost form their own convergence clubs when we treat all Schengen countries as EU members. The GMM and FE models show similar results, where unconditional β−convergence is stronger in non-EU countries, but conditional β−convergence is stronger in EU countries when we look at point estimates. However, the 95 per cent confidence intervals of the EU and non-EU countries overlap, such that the differences in these samples are not observed with high certainty. We observe comparable σ−convergence in the EU and non-EU countries, but it is stronger in the non-EU countries if we treat all Schengen countries as EU members.

The remainder of this paper is structured in the following way. Section 2 states a theoretical background and a summary of the methods and results of previous literature. Section 3 provides an overview of the data and a description of the methodology. Section 4 contains a discussion of the results, using several figures and tables. Section 5 concludes.

The Appendix contains an overview of the most important results of the papers discussed in the literature review. We provide information about the data we use in this research, and we give additional information about the creation of the spatial effects variable. Moreover, some mathematical properties of the log t test and a description of the convergence club algorithm, including sensitivity analysis tables, can be found in the Appendix.

2. LITERATURE REVIEW

In this section, we discuss some theoretical concepts and findings of former research. First, we discuss the most important policies of the EU and how they possibly lead to convergence.

Second, we define the different concepts of convergence, and we summarize the methods and results of previous research for each of these concepts. This section concludes with formulating the research questions of this paper. An overview of the discussed articles with their main results can be found in Appendix A.1.

2.1. The European Union

European integration began with common markets in coal and steel in 1951. Further integration between European countries led to the European Economic Community in 1957, the European Community in 1986, and the European Union (EU) in 1993, after the Maastricht Treaty was signed. Only six countries were part of the European cooperation in 1951, but many countries have joined in the last decades. The EU currently has 27 member states. In this subsection, we discuss the most important aspects of financial integration in the EU, which may explain why the economies of the EU countries converge to each other.

One of the most important aspects is that the EU is a customs union. Free trade of goods, services, labor, and capital is possible within the EU countries. Especially less developed countries can benefit from the customs union. These countries can reach a better economic state by skipping intermediate developments, called leapfrogging, and directly use the methods of the most developed countries in the EU. Leapfrogging happened, for example, in the agricultural sector by promoting technical progress and by optimizing production factors (Neal, 2007).

There are no trade and tariff barriers between the member states, which leads to trade creation. Welfare increases by shifting the production to more productive countries and by a larger amount of available goods. Especially the countries with many unproductive sectors can benefit a lot from this. However, there is also a loss of tariff revenues, which is called trade diversion. Neal (2007) showed that the positive effects of trade creation are much larger than the negative effects of trade diversion in the EU.

Another aspect of the EU is that most members replaced their national currency with the Euro. Moreover, the European Central Bank (ECB) became responsible for the monetary policy of these countries. The main goal of the ECB is price stability. The ECB influences the economic growth of the member states by adjusting the lending and borrowing rates. As a result of the ECB policy, inflation and economic growth are stabilized and converging among the member countries (Neal, 2007).

An increasing part of the EU budget is spent on regional policy programs. One of those policies, laid down in the EU treaty, targets economic and social cohesion to reduce disparities between the member states. The cohesion funds only go to the four poorest EU countries, which are Greece, Ireland, Portugal, and Spain (Ramajo et al., 2007). The contributing countries are willing to contribute to this policy if the receiving countries accept some conditions for spending the extra funds (Neal, 2007).

Cohesion is one of the priorities of the European Union (Monfort, 2008). The European Cohesion Policy can be summarized by article 2 and article 158 of the Treaty establishing the European Community. Article 2 states: ’Promote economic and social progress, as well as a high level of employment, and achieve balanced and sustainable development’. Article 158 states: ’In particular, the Community aims to reduce the disparities between the levels of development of the different regions and the backwardness of the least favored regions or islands, including rural areas’. These articles indicate that the EU aims for growth-promoting circumstances, especially for relatively poor countries.

2.2. Former Research

Article 158 implies that the target of the EU cohesion policy is to reach σ−convergence.

Some researchers, for example Quah (1993), emphasize that convergence is about the dispersion of cross-sectional data, which limits analysis based on β−convergence. Convergence should be measured directly by the dynamics of income levels across countries. A coefficient in a regression model for β−convergence that indicates convergence does not necessarily imply that the dispersion decreases. However, researchers are also interested in β−convergence, as it shows which variables play an essential role in the growth of the log GDP per capita, whereas σ−convergence only indicates convergence using the financial measure itself (Islam, 2003). Ram (2018) argues that although the concept of σ−convergence is more basic, the comparison of both measures is helpful, as they are different in construction, units of measurement, and mathematical properties.

In this subsection, we review the most important methods and results of both forms of convergence discussed in former papers. We explain the convergence club literature as well.

We discuss research based on both country and regional data. Literature often uses the Nomenclature of territorial units for statistics (NUTS), which is a hierarchical system to divide the economic territory of the EU. The NUTS2 regions are most commonly used, where each region has between 800,000 and 3 million inhabitants.

2.2.1. Convergence clubs

Convergence tests can be used to show whether all countries converge to the same steady-state or whether some groups of countries exist that converge to the same steady-state. The log t test developed by Phillips & Sul (2007, p. 1788-1789) can be used to test the overall convergence of the data. For this test, we first need to calculate the cross-sectional variance ratio $H_1/H_t$, where $H_t$ can be written as

$$H_t = \frac{1}{N}\sum_{i=1}^{N}(h_{i,t}-1)^2, \qquad h_{i,t} = \frac{GDP_{i,t}}{\frac{1}{N}\sum_{j=1}^{N} GDP_{j,t}}, \tag{1}$$

where $GDP_{i,t}$ is the log GDP per capita of country i in year t, and i = 1, ..., N and t = 1, ..., T correspond to all countries and years included in the model. This quantity can be used to perform the linear regression

$$\log\left(\frac{H_1}{H_t}\right) - 2\log(\log(t+1)) = \hat{a} + \hat{b}\log(t) + \hat{u}_t, \tag{2}$$

for $t = [0.3T], [0.3T]+1, \dots, T$.
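As an illustration of equations (1) and (2), the following is a minimal sketch in plain Python, not the implementation used in this research. The function names and the toy panel are hypothetical, and the sketch assumes a balanced panel stored as a list of per-country series. It only computes the OLS coefficients; the HAC standard error needed for the test statistic is not included here.

```python
import math


def relative_transition(gdp):
    """gdp[i][t] is the log GDP per capita of country i in period t.
    Returns h[i][t] = gdp[i][t] divided by the cross-sectional mean of period t."""
    n, periods = len(gdp), len(gdp[0])
    means = [sum(gdp[i][t] for i in range(n)) / n for t in range(periods)]
    return [[gdp[i][t] / means[t] for t in range(periods)] for i in range(n)]


def cross_sectional_variance(h):
    """H_t = (1/N) * sum_i (h_it - 1)^2 for every period t, as in equation (1)."""
    n, periods = len(h), len(h[0])
    return [sum((h[i][t] - 1.0) ** 2 for i in range(n)) / n for t in range(periods)]


def log_t_regression(H, r=0.3):
    """OLS of log(H_1/H_t) - 2*log(log(t+1)) on a constant and log(t)
    for t = [rT], ..., T, with periods numbered from 1 as in equation (2).
    Returns (a_hat, b_hat, residuals)."""
    T = len(H)
    start = max(int(r * T), 1)  # integer part of rT, but never before period 1
    ts = list(range(start, T + 1))
    x = [math.log(t) for t in ts]
    y = [math.log(H[0] / H[t - 1]) - 2.0 * math.log(math.log(t + 1)) for t in ts]
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a_hat = y_bar - b_hat * x_bar
    residuals = [yi - a_hat - b_hat * xi for xi, yi in zip(x, y)]
    return a_hat, b_hat, residuals


# Hypothetical panel: three countries whose gaps shrink geometrically over 20 years.
toy = [[10.0 + d * 0.9 ** t for t in range(20)] for d in (-1.0, 0.0, 1.0)]
H = cross_sectional_variance(relative_transition(toy))
a_hat, b_hat, resid = log_t_regression(H)
```

By construction, the relative transition coefficients average to one in every period, so $H_t$ measures their dispersion around one.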

The t-statistic $t_{\hat{b}}$ of the coefficient $\hat{b}$ in equation (2) must be calculated, where the heteroskedasticity and autocorrelation consistent (HAC) variance-covariance estimator is used. The variance $\Omega$ of the HAC estimator can be estimated non-parametrically by

$$\hat{\Omega} = \sum_{j=-T+1}^{T-1} k\left(\frac{j}{B}\right)\hat{\Gamma}(j), \qquad \hat{\Gamma}(j) = \begin{cases} \dfrac{1}{T}\sum_{t=1}^{T-j} x_{t+j}\hat{u}_{t+j}\hat{u}_{t}' x_{t}' & \text{for } j \geq 0, \\[2ex] \dfrac{1}{T}\sum_{t=-j+1}^{T} x_{t+j}\hat{u}_{t+j}\hat{u}_{t}' x_{t}' & \text{for } j < 0, \end{cases} \tag{3}$$

where $k(j/B)$ is a kernel function with bandwidth B, and $\hat{\Gamma}(j)$ is the estimated sample covariance with $x_t = \log(t)$, equal to the explanatory variable of the log t test (Phillips et al., 2007).

The null hypothesis of the log t test, which states convergence, is rejected at the 5 per cent significance level if $t_{\hat{b}} \leq -1.65$. Some mathematical properties of this convergence test can be found in Appendix A.2. Bartkowska & Riedl (2012) reject the null hypothesis for the European NUTS2 regions for the period 1990-2002, and Lyncker & Thoennessen (2017) reject it for the NUTS2 regions of the EU-15 countries for the period 1980-2011, which indicates that European regions do not converge to the same steady-state. In this research, we test the null hypothesis for European country data instead of regional data.
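The one-sided decision rule can be sketched as follows. This is a hedged illustration rather than the procedure used in this research: the text does not fix the kernel or the bandwidth, so the Bartlett kernel and the bandwidth of 3 below are assumptions, and the function names and the simulated series are hypothetical. The slope variance uses the fact that, in a regression with an intercept, the HAC sandwich for the slope reduces to the long-run variance of $(x_t - \bar{x})\hat{u}_t$ divided by the squared sum of squares of $x_t - \bar{x}$.

```python
import math


def bartlett(j, bandwidth):
    """Bartlett kernel weight k(j/B) = max(0, 1 - |j|/B) (assumed kernel choice)."""
    return max(0.0, 1.0 - abs(j) / bandwidth)


def hac_t_statistic(x, y, bandwidth):
    """OLS of y on a constant and x; returns (b_hat, t_b) with a HAC standard error."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a_hat = y_bar - b_hat * x_bar
    u = [yi - a_hat - b_hat * xi for xi, yi in zip(x, y)]
    v = [(xi - x_bar) * ui for xi, ui in zip(x, u)]
    # Lag-0 term plus twice the kernel-weighted autocovariance sums (symmetry in j).
    omega = sum(vt * vt for vt in v)
    for j in range(1, n):
        weight = bartlett(j, bandwidth)
        if weight == 0.0:
            break
        omega += 2.0 * weight * sum(v[t] * v[t + j] for t in range(n - j))
    se_b = math.sqrt(omega) / sxx
    return b_hat, b_hat / se_b


def reject_convergence(t_b, critical_value=-1.65):
    """One-sided test: the null of convergence is rejected for t_b <= -1.65."""
    return t_b <= critical_value


# Hypothetical regression data with a clearly negative slope.
x = [math.log(t) for t in range(5, 26)]
y = [1.0 - 0.5 * xi + 0.02 * math.sin(3 * t) for t, xi in zip(range(5, 26), x)]
b_hat, t_b = hac_t_statistic(x, y, bandwidth=3)
```

A clearly negative slope, as in this artificial series, leads to rejection, whereas a t-statistic above the critical value leaves the null of convergence in place.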

If convergence is rejected for the whole sample, the log t test can be performed on subsets of the data to determine whether subgroups exist that converge to the same steady-state. These subgroups can be formed in a 4-step algorithm, which is described by Phillips & Sul (2007, p. 1800-1801) and extended by Lyncker & Thoennessen (2017). These algorithms can be found in Appendix A.3.

Bartkowska & Riedl (2012) do not use the extension of Lyncker & Thoennessen (2017) but observe six converging subgroups and three diverging regions. They find that regions in the same country tend to cluster together, but the region where the capital is located often belongs to a higher convergence club than the neighboring regions. Regions belonging to the same convergence club often cluster together, which indicates that spatial effects play a role. The convergence policy of the EU is partly visible, as some initially poor regions that receive funds end up in the highest convergence club, but some countries that receive cohesion funds are still in a low convergence club. These effects are also visible in the research of Lyncker & Thoennessen (2017). They observe four convergence clubs and one diverging region and find that the Northern and Southern European regions are often in a rich and a poor convergence club, respectively.

The convergence clubs can be seen as an ordinal variable, as they can be ordered by the level of log GDP per capita to which each club converges. Bartkowska & Riedl (2012) and Lyncker & Thoennessen (2017) use an ordered logistic model, proposed by McKelvey & Zavoina (1975), to find which variables can explain why the regions are in a particular convergence club.

Both papers show similar results, where mainly initial conditions determine in which convergence club the regions end up. Using marginal effects of the ordered logistic model, they find how the probability of ending up in a particular convergence club changes if the value of one of the variables changes. Lyncker & Thoennessen (2017) find, for example, that a higher initial human capital leads to a higher probability of ending up in the highest convergence club, but a higher initial physical capital leads to a lower probability of ending up in the highest convergence club.
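To make the role of the ordered logistic model concrete, the sketch below computes the club probabilities implied by a latent index and a set of cutpoints. It is a generic textbook formulation under assumed values: the cutpoints, the index values, and the coding of clubs from poorest (first) to richest (last) are hypothetical and not estimates from any of the cited papers.

```python
import math


def logistic_cdf(z):
    """Cumulative distribution function of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(index, cutpoints):
    """Probabilities of the ordered categories given index = x'beta.

    With ascending cutpoints c_1 < ... < c_{K-1}, category k has probability
    F(c_k - index) - F(c_{k-1} - index), where F(c_0) = 0 and F(c_K) = 1."""
    cdf = [logistic_cdf(c - index) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for value in cdf:
        probs.append(value - previous)
        previous = value
    return probs


# Hypothetical cutpoints for five clubs and two values of the latent index.
cutpoints = [-1.0, 0.0, 1.0, 2.0]
p_low = ordered_logit_probs(0.0, cutpoints)
p_high = ordered_logit_probs(1.0, cutpoints)
# Finite-difference marginal effect of the index on each club probability.
marginal = [(hi - lo) / 1.0 for hi, lo in zip(p_high, p_low)]
```

Raising the index shifts probability mass from the first category towards the last, which is the mechanism behind the marginal effects reported in these papers.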

2.2.2. β−convergence

β-convergence refers to the process where countries that are very developed at an initial point in time develop less compared to countries that are less developed at an initial point in time, such that the differences between the countries reduce. We find unconditional β-convergence if all economies converge to the same steady-state, and we observe conditional β-convergence if the economies converge to different steady-states, which depends on the level of some explanatory variables (Monfort, 2008).

We observe β−convergence if the log GDP per capita growth negatively depends on the previous value of the log GDP per capita, such that the increase is larger in countries with a low log GDP per capita compared to countries with a high log GDP per capita (Kilinc et al., 2017). Most literature uses the generalized method of moments (GMM) to estimate models for β−convergence, but fixed effects (FE) estimation is also a popular method. In this subsection, we summarize the literature on both methods.

β−convergence can be measured by a simple regression of the current value of the log GDP per capita on the previous value (Kilinc et al., 2017). However, Bond et al. (2001) argue that OLS and Within-Group estimators are respectively upwards and downwards biased in cross-country growth regressions and that system GMM is unbiased and consistent. Therefore, estimating β−convergence based on system GMM is one of the preferred methods in recent literature.

Bouayad-Agha & Verdrine (2010) argue that the development of a country does not only depend on its own values but also on the development of neighboring countries. Trade relations, factor mobility, and geographical spillovers are essential factors of economic growth. Arbia et al. (2008) find evidence for model misspecification if spatial effects are not taken into account in growth models.

Barro (2015) argues that empirical findings on convergence depend on country-specific fixed effects, which are time-independent. In this way, the model allows for unobserved country characteristics that influence the log GDP per capita. These effects likely exist in convergence models, as relatively rich countries often have more favorable characteristics than relatively poor countries. For example, high-quality education can lead to a higher steady-state of the log GDP per capita (Barro, 2015). He argues that the exclusion of country-specific effects tends to bias upwards the effect of the previous log GDP per capita on the log GDP per capita growth, such that weaker convergence is observed.

Bouayad-Agha & Verdrine (2010) propose a GMM approach with spatial and fixed effects based on the equation

$$y_{i,t} = \alpha y_{i,t-1} + \rho\sum_{j=1}^{N} w_{i,j}\,y_{j,t} + \beta x_{i,t} + \gamma_i + e_{i,t}, \tag{4}$$

where $y_{i,t}$ is the log GDP per capita and $x_{i,t}$ contains the control variables of country i at time t. $w_{i,j}$ is element (i, j) of the N×N spatial matrix W, which contains the weights of the spatial effects between countries i and j. The fixed effects $\gamma_i$ play no role in the GMM model, as the moment conditions are based on the first differences of model (4).
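The spatial term $\sum_{j} w_{i,j} y_{j,t}$ in equation (4) is a weighted average of the other countries' values. A minimal sketch, assuming inverse-distance weights that are row-normalized with a zero diagonal; the weighting scheme, the function names, and the distances are illustrative assumptions and not necessarily the construction used in this research.

```python
def row_normalize(weights):
    """Set the diagonal to zero and scale every row of W to sum to one."""
    n = len(weights)
    normalized = []
    for i in range(n):
        row = [0.0 if i == j else float(weights[i][j]) for j in range(n)]
        total = sum(row)
        normalized.append([w / total if total > 0 else 0.0 for w in row])
    return normalized


def spatial_lag(W, y_t):
    """Returns sum_j w_ij * y_jt for every country i in one period t."""
    return [sum(W[i][j] * y_t[j] for j in range(len(y_t))) for i in range(len(W))]


# Hypothetical pairwise distances in kilometres between three countries.
distances = [[0.0, 100.0, 400.0],
             [100.0, 0.0, 200.0],
             [400.0, 200.0, 0.0]]
raw = [[0.0 if d == 0.0 else 1.0 / d for d in row] for row in distances]
W = row_normalize(raw)
lagged = spatial_lag(W, [9.0, 10.0, 11.0])
```

With row-normalized weights, the spatial lag of a country is a weighted mean of its neighbors, so the coefficient ρ can be read as the response to the average neighboring log GDP per capita.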

Arellano & Bond (1991) propose moment conditions based on the dependent variable. They use the assumption that the error terms are independent of the dependent variable in earlier periods. This makes $\Delta e_{i,t}$ uncorrelated with $y_{i,t-2}$ and all earlier observations of the dependent variable of country i. Therefore, they use the moment conditions

$$E(y_{i,s}\,\Delta e_{i,t}) = 0 \quad \text{for } s = 1, \dots, t-2 \text{ and } t = 3, \dots, T. \tag{5}$$

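Bond (2002) states that the moment conditions based on the explanatory variables depend on the assumption about their correlation with the error terms:

$$\begin{aligned}
E(x_{i,s}\,\Delta e_{i,t}) &= 0 \ \text{ for } s = 1, \dots, T \text{ and } t = 3, \dots, T, && \text{if } x_{i,t} \text{ is strictly exogenous,} \\
E(x_{i,s}\,\Delta e_{i,t}) &= 0 \ \text{ for } s = 1, \dots, t-1 \text{ and } t = 3, \dots, T, && \text{if } x_{i,t} \text{ is weakly exogenous,} \\
E(x_{i,s}\,\Delta e_{i,t}) &= 0 \ \text{ for } s = 1, \dots, t-2 \text{ and } t = 3, \dots, T, && \text{if } x_{i,t} \text{ is endogenous.}
\end{aligned} \tag{6}$$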
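The index ranges in the moment conditions above determine which levels may serve as instruments for the first-differenced equation at each t. The following sketch enumerates them; the function names and the string labels for the exogeneity assumptions are hypothetical.

```python
def valid_instrument_periods(t, T, kind):
    """Periods s whose level is a valid instrument for the differenced equation at time t.

    kind is one of "strictly_exogenous", "weakly_exogenous" or "endogenous".
    The lagged dependent variable uses the same range as an endogenous regressor."""
    if not 3 <= t <= T:
        raise ValueError("differenced equations exist for t = 3, ..., T")
    if kind == "strictly_exogenous":
        return list(range(1, T + 1))   # s = 1, ..., T
    if kind == "weakly_exogenous":
        return list(range(1, t))       # s = 1, ..., t-1
    if kind == "endogenous":
        return list(range(1, t - 1))   # s = 1, ..., t-2
    raise ValueError("unknown exogeneity assumption: " + kind)


def count_moment_conditions(T, kind):
    """Total number of moment conditions for one variable over t = 3, ..., T."""
    return sum(len(valid_instrument_periods(t, T, kind)) for t in range(3, T + 1))


# With T = 25 yearly observations, the number of conditions grows quadratically in T.
n_lagged_dependent = count_moment_conditions(25, "endogenous")
n_weakly_exogenous = count_moment_conditions(25, "weakly_exogenous")
```

For the lagged dependent variable the count equals (T−2)(T−1)/2, which illustrates why the number of instruments can become large relative to the number of countries in long panels.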

Bouayad-Agha & Verdrine (2010) mention that as a result of the spatial term in equation (4), more moment conditions are needed for the GMM estimator to be unbiased and consistent. They argue that the term $\sum_{j=1}^{N} w_{i,j}\,y_{j,t}$ can be seen as an endogenous explanatory variable, which leads to the additional moment conditions

$$E\left(\sum_{j=1}^{N} w_{i,j}\,y_{j,s}\,\Delta e_{i,t}\right) = 0 \quad \text{for } s = 1, \dots, t-2 \text{ and } t = 3, \dots, T. \tag{7}$$

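Combining the moment conditions (5), (6), and (7) for country i in the matrix $Z_i$, we get $E(Z_i'\,\Delta e_i) = 0$, where $\Delta e_i = (\Delta e_{i,3}, \Delta e_{i,4}, \dots, \Delta e_{i,T})'$. We can define the GMM estimator as

$$(\hat{\alpha}, \hat{\rho}, \hat{\beta}) = \arg\min_{(\alpha, \rho, \beta)} \; g_N'\,C_N\,g_N, \tag{8}$$

where $g_N = \frac{1}{N}\sum_{i=1}^{N} Z_i'\left(\Delta y_{i,t} - \alpha\,\Delta y_{i,t-1} - \rho\,\Delta\sum_{j=1}^{N} w_{i,j}\,y_{j,t} - \beta\,\Delta x_{i,t}\right)$, with the differenced terms stacked over t = 3, ..., T. The weight matrix $C_N$ depends on whether the one-step or two-step GMM procedure is used (Bouayad-Agha & Verdrine, 2010).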

We observe β−convergence if the estimated coefficient α in equation (4) is smaller than one. In that case, the value of $y_{i,t-1}$ has a negative effect on the growth of $y_i$ between t−1 and t. The lower the value of α, the higher the convergence speed (Bouayad-Agha & Verdrine, 2010).

Bouayad-Agha & Verdrine (2010) estimate β−convergence using the log GDP per capita of regional data in the period 1980-2005 of the EU-15 countries based on a model without spatial effects, a model with spatial effects of the dependent variable, a model with spatial effects of the dependent variable and spatial effects of a lag of the dependent variable, and a model with a spatial error term. They find significant β−convergence in all models, but the convergence speed is much higher in models with spatial effects. Arbia et al. (2008) find similar results using many different models, from a simple cross-sectional model to a GMM model with spatial effects. Kilinc et al. (2017) use the log GDP per capita on country level. They observe significant β−convergence for the EU-15 countries with the GMM model but without spatial effects, using data for the period 1963-2012.

Instead of GMM, β−convergence can also be measured using FE models. Barro (2015) uses as a starting point for the FE model

$$y_{i,t} = \alpha y_{i,t-1} + \beta x_{i,t} + \gamma_i + e_{i,t}. \tag{9}$$

Equation (9) is similar to equation (4), but spatial effects are not included due to endogeneity. Barro (2015) calculates the FE estimator after demeaning the data, where the average log GDP per capita over all years of the corresponding country is subtracted from the original value. This procedure must be followed for both dependent and independent variables. Then we can write the model as

$$\tilde{y}_{i,t} = \alpha\,\tilde{y}_{i,t-1} + \beta\,\tilde{x}_{i,t} + \tilde{e}_{i,t}, \tag{10}$$

where $\tilde{y}_{i,t} = y_{i,t} - \frac{1}{T}\sum_{t=1}^{T} y_{i,t}$. The country-specific effects are no longer in the equation, as $\tilde{\gamma}_i = \gamma_i - \frac{1}{T}\sum_{t=1}^{T}\gamma_i = 0$. In this way, we again allow for country-specific effects, without including them explicitly in the model. Barro (2015) estimates the FE model using the OLS objective function

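$$(\hat{\alpha}, \hat{\beta}) = \arg\min_{(\alpha, \beta)} \; \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\left(\tilde{y}_{i,t} - \alpha\,\tilde{y}_{i,t-1} - \beta\,\tilde{x}_{i,t}\right)^2. \tag{11}$$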
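The demeaning step and the resulting OLS estimator can be sketched for the simplest case, in which the lagged dependent variable is the only regressor. Dropping the control variables is a simplification made here for brevity, and the function name and the noise-free toy panel are hypothetical.

```python
def within_ar1(panel):
    """FE (within) estimate of alpha in y_it = alpha * y_i,t-1 + gamma_i + e_it.

    panel[i] holds the series y_i1, ..., y_iT of country i. Current and lagged
    values are demeaned per country over the T-1 usable periods, which removes
    the fixed effect gamma_i before applying OLS."""
    numerator = denominator = 0.0
    for series in panel:
        lagged, current = series[:-1], series[1:]
        lag_mean = sum(lagged) / len(lagged)
        cur_mean = sum(current) / len(current)
        for y_lag, y_cur in zip(lagged, current):
            numerator += (y_lag - lag_mean) * (y_cur - cur_mean)
            denominator += (y_lag - lag_mean) ** 2
    return numerator / denominator


# Hypothetical noise-free panel: each country approaches its own steady-state m
# at rate alpha = 0.8, so y_t = 0.8 * y_{t-1} + 0.2 * m holds exactly.
toy = [[m + c * 0.8 ** t for t in range(11)]
       for m, c in ((9.0, 1.0), (10.0, -0.5), (11.0, 2.0))]
alpha_hat = within_ar1(toy)
```

Without error terms the within estimator recovers α exactly even though the steady-states differ across countries. With noise, the estimate is subject to the small-sample bias of the FE estimator in dynamic panels.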

Barro (2015) estimates β-convergence of the log GDP per capita using OLS with and without fixed effects on a dataset with 89 countries in the period 1960-2010. He finds β-convergence in both regressions, but convergence is stronger when fixed effects are included. Fiaschi et al. (2018) estimate β-convergence for the log GDP per worker using a FE model with spatial effects for the period 1991-2008. They find significant β-convergence in the EU-12 regions and show that the EU cohesion policy is effective. Tselios (2009) uses OLS and maximum-likelihood (ML) models with and without fixed effects for 102 EU regions in the period 1995-2000 and finds significant β-convergence of the income per capita in all models with fixed effects included. Ramajo et al. (2007) find significant β−convergence of the log GDP per capita of the regions in 12 EU countries using OLS and ML methods. They observe stronger convergence in relatively poor EU countries, which receive funds from the EU cohesion fund. This indicates that the policy of the EU leads to more convergence between the member states.

Besides Barro (2015), more researchers find β−convergence using data outside Europe. Young et al. (2018) observe β−convergence for all United States (US) counties together and for the counties in each state separately. However, β−convergence is not significant in some states. Hembram & Haldar (2019) find unconditional β−divergence but conditional β−convergence for 22 Indian states between 1980 and 2016. The most important variables that explain these opposite results are the bank deposit per capita and the physical infrastructure index. Dey & Neogi (2015) observe unconditional β−convergence in different time periods for 8 Asian countries, including China. They find much stronger β−convergence in the period 1985-2010 than in the period 1970-1985.

However, Hurwicz (1950) and Nickell (1981) argue that the FE estimator, stated in equation (11), is biased. Hurwicz (1950) shows that the error term $e_{i,t}$ appears in the sample mean of $y_{i,t-1}$, such that a higher $e_{i,t}$ leads to a higher sample mean of $y_{i,t-1}$ and therefore to a lower $\tilde{y}_{i,t-1}$. This implies a negative covariance between $\tilde{y}_{i,t-1}$ and $e_{i,t}$, which leads to a downward bias of the FE estimator, such that convergence is overestimated. Nickell (1981) provides a formula for this bias, and Barro (2015) adjusted this formula to a convergence setting. This formula for the Nickell bias can be written as

$$\frac{\hat{\nu} - \nu}{\nu} \approx \frac{2\left(e^{-\nu T^*} - 1 + \nu T^*\right)}{\nu^2 T^{*2} - 2\left(e^{-\nu T^*} - 1 + \nu T^*\right)}, \tag{12}$$

where ν is the magnitude of the yearly convergence rate, and $T^*$ is the time span of the data in years.

The coefficient of the lagged dependent variable α in the models above can be written as $\alpha = e^{-\nu\tau}$, where τ is the length of one period in years. If we observe convergence, ν is positive, and the bias is positive as well, such that we overestimate the speed of convergence. The bias size only depends on the product $\nu T^*$, so the bias does not change if the period length τ changes, keeping the time span $T^*$ fixed. Therefore, increasing the data frequency by reducing the period length from, for example, five years to one year does not lead to a bias reduction. When the time span of the sample approaches infinity, the bias disappears. However, when the sample has a reasonable length and we observe strong and significant convergence, this bias is not negligible.
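The properties of the bias formula described above can be checked numerically. The sketch below evaluates the approximation in equation (12) exactly as stated in the text; the function names and the example values of the convergence rate and time span are hypothetical.

```python
import math


def annual_convergence_rate(alpha, tau=1.0):
    """Yearly convergence rate nu implied by alpha = exp(-nu * tau)."""
    return -math.log(alpha) / tau


def nickell_relative_bias(nu, time_span):
    """Approximate relative bias (nu_hat - nu) / nu from equation (12).

    The value depends on nu and the time span T* only through their product."""
    z = nu * time_span
    core = 2.0 * (math.exp(-z) - 1.0 + z)
    return core / (z * z - core)


# Hypothetical example: a yearly convergence rate of 2 per cent.
bias_25_years = nickell_relative_bias(0.02, 25)
bias_100_years = nickell_relative_bias(0.02, 100)
```

The relative bias is positive for every positive product of ν and T*, it shrinks as the time span grows, and it is unchanged when ν is halved while T* is doubled.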

Kiviet (1995) proposes a bias-corrected FE estimator using Monte Carlo simulations. He makes an analytical approximation of the sample bias in a dynamic panel data model, such that a substantial part of the bias can be removed. However, this bias-corrected FE estimator is derived under a set of conditions, such as homoskedasticity, which are often violated in practice. Everaert & Pozzi (2007) derive a bootstrap procedure to correct for the bias, where no analytical expression for the bias of the FE estimator is needed, as it is calculated numerically using bootstrap samples. Their Monte Carlo analysis shows that their bias-corrected FE estimator performs similarly to the one of Kiviet (1995) but is more widely applicable.

In this research, we use the split panel jackknife (SPJ) estimator, developed by Dheane & Jochmans (2015), to correct for the Nickell bias. The SPJ estimates the bias by comparing the ML estimate of the full sample with the estimates of two subsamples. The SPJ estimator is easy to use, as it requires only a method to compute ML estimates, where no analytical computations and explicit characterization of the Nickell bias are needed. Dheane & Jochmans (2015) argue that the standard FE estimator is often inconsistent when N→∞ and T remains fixed, due to the Nickell bias. They show that the asymptotic bias of the standard FE estimator is $O(T^{-1})$, so the estimator is only consistent if both N, T→∞. However, the SPJ removes the $O(T^{-1})$ term in the bias and reduces it to $O(T^{-2})$. Therefore, in terms of bias, the SPJ is a much better estimator than the standard FE estimator (Dheane & Jochmans, 2015). The procedure to compute the SPJ estimator can be found in the methodology section.
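The idea of comparing the full-sample estimate with two subsample estimates can be sketched with the half-panel form of the jackknife, in which the corrected estimate is twice the full-sample estimate minus the average of the two half-panel estimates. The sketch makes several simplifying assumptions that are not taken from the text: the model contains only the lagged dependent variable, the within estimator stands in for the ML estimate, and the two half-panels share one observation that serves as the initial value of the second half. All function names are hypothetical.

```python
def within_ar1(panel):
    """FE (within) estimate of alpha in y_it = alpha * y_i,t-1 + gamma_i + e_it."""
    numerator = denominator = 0.0
    for series in panel:
        lagged, current = series[:-1], series[1:]
        lag_mean = sum(lagged) / len(lagged)
        cur_mean = sum(current) / len(current)
        for y_lag, y_cur in zip(lagged, current):
            numerator += (y_lag - lag_mean) * (y_cur - cur_mean)
            denominator += (y_lag - lag_mean) ** 2
    return numerator / denominator


def split_panel_jackknife(panel):
    """Half-panel jackknife: 2 * full estimate - mean of the two half-panel estimates."""
    pairs = len(panel[0]) - 1  # number of (y_t, y_{t-1}) pairs per country
    if pairs % 2 != 0 or pairs < 4:
        raise ValueError("need an even number of at least four usable periods")
    half = pairs // 2
    first = [series[: half + 1] for series in panel]   # first `half` pairs
    second = [series[half:] for series in panel]       # last `half` pairs
    full_estimate = within_ar1(panel)
    half_mean = 0.5 * (within_ar1(first) + within_ar1(second))
    return 2.0 * full_estimate - half_mean


# Hypothetical noise-free panel with alpha = 0.8 and country-specific steady-states.
toy = [[m + c * 0.8 ** t for t in range(11)]
       for m, c in ((9.0, 1.0), (10.0, -0.5), (11.0, 2.0))]
alpha_spj = split_panel_jackknife(toy)
```

The correction works because the leading bias term is proportional to 1/T: the half-panels have half the time span, so their average bias is roughly twice that of the full sample, and the linear combination cancels the leading term.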

2.2.3. σ−convergence

σ−convergence was introduced by Quah & Barro (1992), and it refers to the process where disparities between countries reduce over time. We observe σ−convergence if the standard deviation (SD) and the coefficient of variation (CV), which is the standard deviation divided by the mean, of the log GDP per capita decrease over time. We observe σ−divergence if the measures increase over time (Ram, 2018). σ−convergence is a stronger definition than β-convergence, as shocks are no longer allowed. Moreover, if the disparities keep decreasing heavily, different steady-states are not possible. β-convergence is necessary, but not sufficient for σ−convergence (Monfort, 2008).

The formulas for SD and CV can be written as

$$SD_t = \left[\frac{1}{N}\sum_{i=1}^{N}\left(GDP_{i,t} - \overline{GDP}_t\right)^2\right]^{0.5}, \tag{13}$$

$$CV_t = SD_t / \overline{GDP}_t, \tag{14}$$

where $GDP_{i,t}$ is the log GDP per capita of country i in year t, and $\overline{GDP}_t = \frac{1}{N}\sum_{i=1}^{N} GDP_{i,t}$ is the average log GDP per capita of all countries in year t. Dalgaard & Vastrup (2001) show that both measures are weighted sums of growth rates of the log GDP per capita, but with different weights. Therefore, the measures cannot be used interchangeably, and it can be insightful to calculate both.
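Both measures are straightforward to compute per year. A minimal sketch, with hypothetical function names and toy values, that follows the definitions above and uses the divisor N rather than N−1:

```python
import math


def sigma_measures(gdp_t):
    """Returns (SD_t, CV_t) for the log GDP per capita values of one year."""
    n = len(gdp_t)
    mean = sum(gdp_t) / n
    sd = math.sqrt(sum((g - mean) ** 2 for g in gdp_t) / n)
    return sd, sd / mean


def sigma_path(gdp):
    """gdp[i][t] is the value of country i in year t; returns (SD_t, CV_t) per year."""
    periods = len(gdp[0])
    return [sigma_measures([series[t] for series in gdp]) for t in range(periods)]


# Hypothetical panel in which the gaps between three countries shrink over time.
toy = [[10.0 + d * 0.9 ** t for t in range(10)] for d in (-1.0, 0.0, 1.0)]
path = sigma_path(toy)
sd_decreases = all(path[t + 1][0] < path[t][0] for t in range(len(path) - 1))
cv_decreases = all(path[t + 1][1] < path[t][1] for t in range(len(path) - 1))
```

In this artificial panel both measures fall in every year, which is the pattern we refer to as σ−convergence.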

Monfort (2008) plots $CV_t$ for the EU-15 and EU-27 countries for the periods 1980-2005 and 1995-2005, respectively, and observes a clear decline of the coefficient of variation over time. Simionescu (2014) calculates both measures for the EU-27 countries for the period 2000-2012, using formulas (13) and (14), and a version where each country gets a weight proportional to the population. In all calculations, the measures decrease slowly over time, which indicates that σ−convergence is present. However, the coefficient of variation is still larger than 0.4 in 2012, which indicates that strong disparities still exist and that the countries have not converged to a single steady-state.

σ−convergence is also investigated using non-European data. Young et al. (2018) observe significant σ−divergence for the US counties, where the standard deviation across all counties increases from 0.2728 in 1970 to 0.2887 in 1998. Significant σ−divergence is also present within the counties of the majority of the individual states. Hembram & Haldar (2019) find a clear upward trend for the standard deviation of the log GDP per capita of the Indian states in the period 1980-2015, meaning that σ−divergence is observed there. Dey & Neogi (2015) observe σ−convergence for eight Asian countries, including China. The coefficient of variation of these countries decreases from 0.6189 in 1970 to 0.1225 in 2011.

Monfort (2008) also mentions some other methods to measure σ−convergence, for example, the Gini coefficient and the Atkinson index. However, relatively few papers use these measures, so we do not take them into account.

2.3. Research questions

We test whether some groups of countries converge to the same steady-state and how these groups are clustered. If the EU has a significant influence on the convergence process, it could be the case that EU and non-EU countries form separate convergence clubs.


Hypothesis 1: Convergence clubs exist among European countries, and a clear separation between EU and non-EU countries can be observed.

In the European Union subsection, we give some arguments why the economies of EU countries should converge. In this research, we test the convergence of the EU countries in practice.

Hypothesis 2A: The economies of the countries in the EU converged after the EU was formed.

Many researchers observe convergence in the European regions or countries, but there is no evidence yet whether non-EU countries in Europe also converge. To see whether the EU could be a reason for the existence of convergence, we test the same hypothesis for non-EU countries in Europe.

Hypothesis 2B: The economies of the countries in Europe but outside the EU converged after the EU was formed.

The policies of the EU potentially increase the speed of convergence. Therefore, in addition to the existence of convergence, we investigate the speed of convergence.

Hypothesis 3: The convergence speed is higher for countries in the EU than for non-EU countries in Europe.

3. METHODOLOGY

In this section, we discuss the methodology of this paper. First, we explain the data by indicating which variables are used and how the dataset is established. Second, we mention how the spatial effects variable is created. Third, we discuss the different methods to indicate convergence.


3.1. Data

We use a panel dataset of 28 EU member countries and 16 European (but not EU) countries.¹ We treat the United Kingdom as an EU country in this research, as it was an EU member until 2020 and is therefore an EU country throughout our panel. The European countries Andorra, Kosovo, Liechtenstein, Monaco, San Marino, and Vatican City are not included in this research, as these countries are too small and lack data.

In this research, we compare the convergence of EU and non-EU countries after the start of the EU. Therefore, we use data for the period 1994-2018. In the regression models, we do not use the data of the first five years, as the financial structures of the countries changed heavily after these years, which can be seen in the results section. In the GMM models, we group the data in five-year periods (1999-2003, 2004-2008, 2009-2013, 2014-2018) to avoid short-run variations due to business-cycle effects, following the research of Bouayad-Agha & Verdrine (2010). We use the average value of the available yearly data to create these five-year periods, which gives a dataset with four observations for each of the 44 countries. In this way, the number of periods is small compared to the number of countries, such that GMM is an appropriate estimation method.

However, we lose part of the information in our data when we group the years, as we only use one-fifth of the number of available observations, where we do not take the yearly changes into account. We have 20 observations for each of the 44 countries when we use each year as an observation. The number of periods is then quite large compared to the number of countries. When we use subsets of the data to compare the EU and non-EU countries, we even have more years than countries in some subsets. Therefore, we prefer FE estimation in this case.

¹ The 27 current member states of the European Union are: Austria, Belgium, Bulgaria, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, The Netherlands, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, and Sweden. The 23 countries that are currently not EU members but are located in Europe are: Albania, Andorra, Armenia, Azerbaijan, Belarus, Bosnia and Herzegovina, Georgia, Iceland, Kosovo, Liechtenstein, Moldova, Monaco, Montenegro, North Macedonia, Norway, Russia, San Marino, Serbia, Switzerland, Turkey, Ukraine, United Kingdom, and Vatican City.


In this research, we measure convergence using the log GDP per capita, defined as the logarithm of the total GDP divided by the midyear population. We measure the log GDP per capita in US dollars at constant 2010 prices to correct for inflation. The log GDP per capita is the most commonly used measure for country convergence in the literature, used for example by Monfort (2008) and Bartkowska & Riedl (2012). Ram (2018) argues that the log GDP per capita is the primary variable representing income, and therefore the best variable for convergence research. However, the methods we discuss in this paper can also be applied to different economic measures.

Following the research of Lyncker & Thoennessen (2017), we categorize the control variables in initial conditions and structural characteristics. An essential structural characteristic and requirement for convergence is a similar production technology (Galor, 1996). Therefore, we use the high-tech part of the total manufacturing and the part of the GDP obtained from services as control variables, following the research of Bartkowska & Riedl (2012). Moreover, we include the part of the GDP obtained from agriculture and industry as control variables, following the research of Lyncker & Thoennessen (2017). Mora (2008) argues that the labor market characteristics are also crucial economic growth determinants and that the female unemployment rate is a good indicator for the labor market. The population growth rate (Mora, 2008) and population density (Corrado et al., 2005) are also essential structural growth determinants, which are included in many papers.

Kilinc et al. (2017) mention that the trade openness and the inflation rate could affect the financial development convergence as well.

Galor (1996) argues that economies with identical structural characteristics only converge if they have the same initial conditions. Therefore, it is also essential to include initial conditions as control variables. If savings only result from wages, the initial level of the capital-labor ratio can determine in which steady-state an economy ends up (Galor, 1996). To control for differences in factor endowments, we include the initial level of the labor force and human capital in our model, following many economic convergence papers. Azariadis & Drazen (1990) argue that rapid growth can only occur with a high level of human investment relative to income. Therefore, we use the initial log income per worker as a control variable as well, following the research of Bartkowska & Riedl (2012).

We obtain most data from the World Development Indicators Database of the World Bank.

We extract most variables directly from the World Bank database and create some variables by combining World Bank variables. We obtain human capital, which we measure as the average years of schooling, from the Our World in Data database.

We compare the results of countries in the EU and countries in Europe but outside the EU. However, one could argue that some countries outside the EU should also be treated as EU countries. Iceland, Norway, and Switzerland are also part of the Schengen countries and have trade agreements with EU countries. Therefore, we make two comparisons: first, the EU countries against all non-EU countries, and second, the EU countries together with the non-EU Schengen countries against the remaining non-EU countries.

A detailed description and summary statistics of the variables can be found in Appendix A.4 and Appendix A.5, respectively. More information about data preparation is stated in Appendix A.6.

3.2. Spatial effects

To create a variable that contains spatial effects, we first construct the weight matrix W. The centroids and capitals of the European countries can be seen in Figure 1. We obtain the longitudes and latitudes of the locations using GeoPandas in Python. Most of the literature, for example Ramajo et al. (2007), suggests using weights based on the great circle distance between the centroids of the countries. However, some countries do not have the most inhabitants and activity around their centroid. Norway and Finland, for example, have most of their activity around their capital city in the south of the country. Besides, the centroid of Russia is far away from the other European countries, such that many weights would be very low or set equal to zero if we used centroids.

Therefore, we prefer to use capitals instead of centroids. It would not make a large difference for most countries, but it makes much more sense to use capitals instead of centroids for some countries. We calculate the great circle distances using the longitudes and latitudes of the capitals.

The formula for the great circle distance can be found in Appendix A.7.
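As a hedged illustration of this step, the sketch below computes a great circle distance from latitudes and longitudes with the haversine formula. That the haversine form matches the formula in Appendix A.7 is an assumption, as are the mean Earth radius of 6371 km and the approximate capital coordinates used in the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate (latitude, longitude) of two capitals; illustrative values only.
amsterdam = (52.37, 4.90)
brussels = (50.85, 4.35)

distance = great_circle_km(*amsterdam, *brussels)
print(round(distance, 1))
```

Applying the function to every pair of capitals gives the symmetric distance matrix that serves as input for the spatial weights.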


Figure 1: Centroids and Capitals of Europe

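Using the great circle distances, we calculate the elements of matrix $W$ by the formula

$$w_{i,j} = \frac{w^{*}_{i,j}}{\sum_{j=1}^{N} w^{*}_{i,j}}, \qquad w^{*}_{i,j} = \begin{cases} 0, & \text{if } i = j, \\ \dfrac{1}{d_{i,j}}, & \text{if } d_{i,j} < D, \\ 0, & \text{if } d_{i,j} > D, \end{cases} \tag{15}$$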

where $w_{i,j}$ is element $ij$ of matrix $W$, which is the spatial weight between countries $i$ and $j$, and $d_{i,j}$ is the great circle distance between the capitals of countries $i$ and $j$. As distance cutoff $D$, we use the median of the great circle distances, following the research of Ramajo et al. (2007). We use $\frac{1}{d_{i,j}}$ instead of the $\frac{1}{d_{i,j}^{2}}$ suggested by most of the literature, as we find more significant spatial effects using these weights when we look at Moran's I statistic. This indicates that countries that are not very close to each other could still have moderate spillover effects. The formula for Moran's I statistic and the test for significant spatial effects can be found in Appendix A.8. We do not find significant spatial effects if we test the whole sample, but we find significant spatial effects in some subsamples. For example, if we only consider Austria, Belgium, France, Germany, Italy, United Kingdom, and Luxembourg, then the spatial effects are significant. Therefore, we include spatial effects in our models for β-convergence.
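The construction of the weights in equation (15) can be sketched in Python as follows. This is a minimal illustration, not the implementation used for the results. It assumes that the median is taken over the distances between distinct pairs of countries and that a country without any neighbour inside the cutoff keeps a row of zeros. The function name `spatial_weights` and the distances are hypothetical.

```python
import statistics


def spatial_weights(dist):
    """Row-normalised inverse-distance weights with a median distance cutoff."""
    n = len(dist)
    # Assumption: the cutoff D is the median over distinct pairs (i < j).
    pairs = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
    cutoff = statistics.median(pairs)
    # Unnormalised weights w*_ij: 1/d_ij inside the cutoff, 0 otherwise and on the diagonal.
    raw = [[1.0 / dist[i][j] if i != j and dist[i][j] < cutoff else 0.0
            for j in range(n)] for i in range(n)]
    weights = []
    for row in raw:
        total = sum(row)
        # Assumption: a country with no neighbour inside the cutoff keeps a zero row.
        weights.append([w / total if total > 0 else 0.0 for w in row])
    return weights, cutoff


# Hypothetical distances in km between four capitals.
dist = [[0.0, 100.0, 300.0, 700.0],
        [100.0, 0.0, 200.0, 600.0],
        [300.0, 200.0, 0.0, 400.0],
        [700.0, 600.0, 400.0, 0.0]]

W, D = spatial_weights(dist)
print(D)
print(W[0])
```

In this example the fourth country lies beyond the cutoff from all others and therefore receives no spatial weights, which mirrors the concern about remote centroids raised above.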


3.3. Convergence

For the convergence clubs and σ-convergence, we follow the methods discussed in the literature review section. In this subsection, we explain the GMM and FE models, which we use to measure β-convergence.

3.3.1. GMM estimation

In the research of Bouayad-Agha & Verdrine (2010), which we discuss in the literature review section, the log GDP per capita values themselves are used as the dependent variable in the GMM regressions. We use the growth of the log GDP per capita as the dependent variable in our models. As we group the data in 5-year periods for the GMM regression, the dependent variable $y_{i,s}$ is the average yearly growth of the log GDP per capita of country $i$ over the years in the 5-year period $s$. For the first period, this variable can be written as

$$y_{i,1} = \frac{1}{5}\big((GDP_{i,1999} - GDP_{i,1998}) + (GDP_{i,2000} - GDP_{i,1999}) + (GDP_{i,2001} - GDP_{i,2000}) + (GDP_{i,2002} - GDP_{i,2001}) + (GDP_{i,2003} - GDP_{i,2002})\big) = \frac{1}{5}\big(GDP_{i,2003} - GDP_{i,1998}\big), \tag{16}$$

where $GDP_{i,t}$ is the log GDP per capita of country $i$ in year $t$. The four periods $s$ contained in the model are 1999-2003, 2004-2008, 2009-2013, and 2014-2018, respectively. The variable $z_{i,s}$ is the average log GDP per capita value of country $i$ over the years in period $s$ and indicates whether convergence is present. For the first period, this variable can be written as

$$z_{i,1} = \frac{1}{5}\big(GDP_{i,1999} + GDP_{i,2000} + GDP_{i,2001} + GDP_{i,2002} + GDP_{i,2003}\big). \tag{17}$$
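The two period variables can be computed for any country as in the following minimal Python sketch, which follows equations (16) and (17) and uses the five-year periods listed in the data section. The function name `period_variables`, the constant `PERIODS`, and the example series are hypothetical and only serve as illustration.

```python
def period_variables(log_gdp, periods):
    """Return (y, z): average yearly growth and average level per period.

    log_gdp maps a year to the log GDP per capita of one country.
    """
    y, z = [], []
    for first, last in periods:
        years = range(first, last + 1)
        # Equation (16): the yearly growth rates telescope to
        # (GDP_last - GDP_{first-1}) divided by the number of years.
        y.append((log_gdp[last] - log_gdp[first - 1]) / len(years))
        # Equation (17): average level over the years in the period.
        z.append(sum(log_gdp[t] for t in years) / len(years))
    return y, z


PERIODS = [(1999, 2003), (2004, 2008), (2009, 2013), (2014, 2018)]

# Hypothetical country: log GDP per capita of 9.0 in 1998, growing by 0.02 per year.
series = {year: 9.0 + 0.02 * (year - 1998) for year in range(1998, 2019)}

y, z = period_variables(series, PERIODS)
print(y)
print(z)
```

Note that the first period needs the 1998 value, so the yearly series must start one year before the first period.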

The variables y_i,s and z_i,s for the periods s=2, 3, 4 can be written in a similar way. Using these variables, we define the model

$$y_{i,s} = \alpha z_{i,s-1} + \rho \sum_{j=1}^{N} w_{i,j} z_{j,s} + \beta x_{i,s} + e_{i,s}, \tag{18}$$


where $y_{i,s}$ and $z_{i,s}$ are the variables stated above, $\sum_{j=1}^{N} w_{i,j} z_{j,s}$ is the spatial control variable, and $x_{i,s}$ contains the average values of the other control variables of country $i$ over the years in period $s$. The spatial variable controls for the effect of the log GDP per capita values of the neighboring countries on the log GDP per capita growth. As we use a lagged value of the average log GDP per capita, we only use three observations for each country in the model. We estimate the model using equations (5)-(8).
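The spatial control variable is a weighted average of the period values of the other countries. A minimal sketch of this calculation in Python, with a hypothetical row-normalised weight matrix and hypothetical values of the average log GDP per capita, is:

```python
def spatial_lag(weights, z):
    """Return sum_j w_ij * z_j for every country i."""
    return [sum(w_ij * z_j for w_ij, z_j in zip(row, z)) for row in weights]


# Hypothetical row-normalised spatial weights for three countries.
W = [[0.0, 0.75, 0.25],
     [0.6, 0.0, 0.4],
     [0.5, 0.5, 0.0]]

# Hypothetical average log GDP per capita of the three countries in one period.
z_s = [10.0, 9.0, 8.0]

lag = spatial_lag(W, z_s)
print(lag)
```

Because the diagonal of the weight matrix is zero, the value of a country itself never enters its own spatial variable.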

If the coefficient α is negative, larger previous log GDP per capita values lead to lower log GDP per capita growth, which means that we observe β−convergence. The more negative the coefficient α is, the stronger the β−convergence we observe.

Each control variable we add in the GMM model leads to extra instruments, according to equation (3). However, having too many instruments causes the Hansen test to be weak (Hansen, 1982). The often-used rule of thumb is to use at most the same number of instruments as the number of countries in the sample (Mileva, 2007). Therefore, we only use the most crucial explanatory variables in the GMM models. Bouayad-Agha & Verdrine (2010) state that investment and population growth are the most crucial variables for the explanation of the log GDP per capita growth, which is also in line with the empirical growth literature of Durlauf et al. (2006). We use the part of the expenditures in the industry sector as an instrument for investment.

The Hansen tests for exogenous instruments are satisfied if we use the spatial variable as an endogenous explanatory variable and the other variables as strictly exogenous variables. When needed, we collapse the instruments to make sure that the number of instruments does not exceed the number of countries.

In this research, we use both one-step and two-step GMM estimators. In the one-step procedure, we estimate the weight matrix $C_N$ in equation (8) directly. In the two-step procedure, we first estimate equation (8) using $C_N = I_r$, where $I_r$ is the identity matrix of dimension $r$, the number of moment conditions. We use the corresponding GMM estimator to find the optimal weight matrix $C_N$, and we estimate equation (5) again using this matrix (Arellano & Bond, 1991). The weight matrix of the one-step GMM estimator is independent of the estimated parameters, whereas the two-step GMM estimator uses a consistent estimate of the covariance matrix to weight the moment conditions (Windmeijer, 2005). Windmeijer (2005) shows that the standard errors could be severely biased in a two-step GMM model with a small sample. We therefore use his corrected variance estimator, such that inference is more accurate. In both models, we use a robust estimation method, such that the errors are robust to panel-specific autocorrelation and heteroskedasticity (Roodman, 2009).

3.3.2. FE estimation

In the literature review section, we discuss some methods to construct an unbiased FE estimator. In this research, we use the SPJ estimator, based on ML estimates of the full sample and subsamples, which is developed by Dhaene & Jochmans (2015). Consider the observations of the log GDP per capita growth $y_{i,t}$, which we can write as

$$y_{i,t} = GDP_{i,t} - GDP_{i,t-1}, \tag{19}$$

where $GDP_{i,t}$ is the log GDP per capita of country $i$ in year $t$ with $i = 1, \ldots, N$ and $t = 1999, \ldots, 2018$.

We can write the FE model as

$$y_{i,t} = \alpha GDP_{i,t-1} + \beta x_{i,t} + \gamma_i + e_{i,t}, \tag{20}$$

where $y_{i,t}$ depends linearly on $GDP_{i,t-1}$, the control variables $x_{i,t}$ described in Appendix A.4, and the country-specific effects $\gamma_i$. No spatial effects variable is included in the model, as this variable is endogenous. So $y_{i,t}$ has a linear density $f(y_{i,t}; \theta, \gamma_i)$, where $\theta = [\alpha, \beta]$ contains the coefficients of the explanatory variables. A negative coefficient $\alpha$ indicates β-convergence, in the same way as in the GMM model, but now using yearly data instead of average yearly data over 5-year periods.

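We can write the standard FE estimator $\hat{\theta}$ as

$$\hat{\theta} = \arg\max_{\theta} \hat{l}(\theta), \tag{21}$$

where

$$\hat{l}(\theta) = \frac{1}{NT} \sum_{i=1}^{N} \sum_{t=1}^{T} \log f\big(y_{i,t}; \theta, \hat{\gamma}_i(\theta)\big), \tag{22}$$

and

$$\hat{\gamma}_i(\theta) = \arg\max_{\gamma_i} \frac{1}{T} \sum_{t=1}^{T} \log f\big(y_{i,t}; \theta, \gamma_i\big). \tag{23}$$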

The SPJ estimator uses the standard FE estimator $\hat{\theta}$, but also ML estimates of subsamples of the data. The FE estimator with SPJ bias-correction $\tilde{\theta}$ can be written as

$$\tilde{\theta} = 2\hat{\theta} - \frac{1}{2}\big(\hat{\theta}_1 + \hat{\theta}_2\big), \tag{24}$$

where $\hat{\theta}$ is the ML estimate of $\theta$, which is calculated using the full sample. $\hat{\theta}_1$ and $\hat{\theta}_2$ are the ML estimates using the first and second half of the observations, $t = 1, \ldots, T/2$ and $t = T/2+1, \ldots, T$, of each country, respectively. So the two subpanels consist of the years 1999-2008 and 2009-2018, respectively. This can lead to quite different estimates $\hat{\theta}_1$ and $\hat{\theta}_2$, as the financial structure of countries can change over time.
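To make the half-panel correction in equation (24) concrete, the following Python sketch applies it to a deliberately simplified case: a linear model with a single regressor, estimated with the within (demeaning) estimator on a balanced panel with an even number of years. This is an illustration under these assumptions and not the estimation code behind the results; the function names and the data are hypothetical.

```python
def within_slope(x, y):
    """FE (within) slope for one regressor; x and y hold one list per country."""
    num = den = 0.0
    for xi, yi in zip(x, y):
        mx, my = sum(xi) / len(xi), sum(yi) / len(yi)
        # Demeaning per country removes the country-specific effect gamma_i.
        num += sum((a - mx) * (b - my) for a, b in zip(xi, yi))
        den += sum((a - mx) ** 2 for a in xi)
    return num / den


def spj_slope(x, y):
    """Split-panel jackknife: 2 * full-sample estimate minus the mean of the half-panel estimates."""
    half = len(x[0]) // 2  # assumes a balanced panel with an even number of years
    full = within_slope(x, y)
    first = within_slope([xi[:half] for xi in x], [yi[:half] for yi in y])
    second = within_slope([xi[half:] for xi in x], [yi[half:] for yi in y])
    return 2 * full - 0.5 * (first + second)


# Hypothetical panel: two countries, four years, exact slope of -0.1.
x = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
y = [[5.0 - 0.1 * a for a in x[0]], [7.0 - 0.1 * a for a in x[1]]]

alpha_spj = spj_slope(x, y)
print(alpha_spj)
```

In this noise-free example the full-sample and half-panel estimates coincide, so the correction leaves the slope unchanged. With real data the three estimates differ, and the difference between the full-sample estimate and the average half-panel estimate acts as the bias correction.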

4. RESULTS

In this section, we discuss the empirical results of this paper. Figure 2 plots the log GDP per capita in 1994 against the average annual percentage change between 1994 and 2018 for all European countries. We find a clear downward pattern, which is a first indication that countries that were highly developed at the initial point in time grow less than relatively less developed countries, such that the disparities between the countries decrease. This pattern seems to hold for both EU and non-EU countries, but EU countries have a higher initial log GDP per capita in general.

Figure 2: Log GDP per capita growth

Figure 3: Mean value of the log GDP per capita. (a) EU and non-EU (b) EU with and non-EU without Schengen

Figure 3a compares the development of the log GDP per capita of EU and non-EU countries, where we also see a higher initial value for the EU countries. The EU and non-EU countries seem to develop in a similar pattern over time. The log GDP per capita grows gradually over time and only decreases during the financial crisis of 2008. In Figure 3b, the non-EU countries that are part of the Schengen countries are treated as EU countries as well. The log GDP per capita of the EU countries barely changes, but the log GDP per capita of the non-EU countries is lower. This indicates that Iceland, Norway, and Switzerland have a GDP per capita that is on average comparable to that of the EU countries but higher than that of the remaining non-EU countries.

The remaining part of this section is structured as follows. First, we test the club convergence hypothesis to determine whether some convergence clubs exist in our data. Second, we show the results of the β−convergence methods, using GMM and FE models. Finally, we investigate whether σ−convergence is present in our whole sample and the EU and non-EU samples separately.


4.1. Convergence clubs

4.1.1. Creating the convergence clubs

In this subsection, we discuss the results of the convergence club hypothesis test, developed by Phillips & Sul (2007) and extended by Lyncker & Thoennessen (2017). We obtain the results by implementing their algorithm in Python and applying it to our dataset. Moreover, we discuss the logistic regression results to understand why countries are in a particular convergence club.

First, we test the null hypothesis on the whole dataset. Our t-statistic equals -9.909, so we reject the null hypothesis, which means that not all countries converge to the same steady-state.

Therefore, we perform the club convergence test on subsets of the dataset to determine whether subgroups exist, which converge to the same steady-state.

Table 1: Countries in each convergence club

Club 1 Club 2 Club 3 Club 4 Club 5 Club 6

Luxembourg Sweden Germany Romania France Albania Ukraine

Switzerland Denmark Finland Greece Hungary Montenegro Moldova

Ireland The Netherlands* Austria Spain Croatia Bosnia and Herz.

Norway Slovak Republic* Poland* United Kingdom Portugal Serbia

Azerbaijan* Lithuania* Belgium* Czech Republic Russia North Macedonia

Iceland* Malta* Estonia Armenia*

Turkey* Italy Bulgaria*

Latvia* Slovenia Belarus*

Cyprus Georgia*

0.468 7.611 0.982 3.858 1.807 2.708

Note: The table contains the countries included in each convergence club before the merging algorithm is applied.

The first countries in each club are the countries that form the core, and the countries denoted by * are added based on the threshold value c=0. The t-statistics of the convergence clubs can be found at the bottom of the table.

We form the convergence clubs using the 4-step algorithm described in Appendix A.3, and these clubs can be found in Table 1. We find six convergence clubs and no diverging countries. We notice that the countries in a particular convergence club do not necessarily have comparable log GDP per capita values. For example, Azerbaijan has a much lower log GDP per capita than the other countries in the first convergence club. This can be explained by the fact that the log GDP per capita of Azerbaijan increased heavily in the period 1994-2018. Therefore, it has the potential to converge to a similar level as the other countries in the first convergence club. However, it can also be the case that Azerbaijan's GDP per capita growth stagnates and converges to a lower GDP per capita value. We observe similar patterns for some countries in the other convergence clubs.

Table 2: Merging the convergence clubs

Step Clubs 1+2 Clubs 2+3 Clubs 3+4 Clubs 4+5 Clubs 5+6 Conclusion

1 0.553 1.411 0.094 -0.255 -5.855 Merge clubs 2 and 3

2 0.163* -3.634 -0.225 -5.855 Merge clubs 1-3

3 -2.835** -0.225 -5.855 Merge clubs 4 and 5

4 -7.215*** -4.940**** Algorithm ends

Note: The table contains the t-statistics of the log t test of all adjacent convergence clubs, following the algorithm of Lyncker & Thoennessen (2017). * is the t-statistic of clubs 1-3, ** is the t-statistic of clubs 1-4, *** is the t-statistic of clubs 1-5, and **** is the t-statistic of clubs 4-6.

The t-statistics of the merging algorithm of Lyncker & Thoennessen (2017) can be found in Table 2. From this algorithm, we see that convergence clubs 1-3 can be merged into a rich convergence club, convergence clubs 4 and 5 into a medium convergence club, and convergence club 6 forms the poor convergence club. These convergence clubs consist of 19, 23, and 2 countries, respectively.

In Figure 4, we see that most Northern European countries belong to the rich convergence club, and most Southern European countries belong to the medium convergence club, which corresponds to the research of Lyncker & Thoennessen (2017). The rich convergence club contains 14 EU and 5 non-EU countries, the medium convergence club contains 14 EU and 9 non-EU countries, and the poor convergence club only contains 2 non-EU countries. The non-EU Schengen countries are all in the rich convergence club, so if we count all Schengen countries as EU countries, the rich convergence club contains 17 EU and 2 non-EU countries. This means that the EU countries are more likely to belong to the rich convergence club and the non-EU countries to belong to the medium convergence club, but we do not observe a strict separation between these groups of countries.

Figure 4: Convergence clubs for the years 1994-2018

4.1.2. Sensitivity of the convergence clubs

The outcome of the convergence club algorithm depends on some parameters that we must choose. The threshold value c determines the number of countries added to the core groups, and the HAC estimator depends on the chosen kernel and bandwidth. Moreover, not using all available data but using an alternative starting and ending year can also lead to different results.

We use threshold value c=0 in the algorithm, which is the value suggested by Phillips & Sul (2009). Using a lower value of c leads to more countries in each convergence club and fewer convergence clubs in total. However, the merging algorithm merges fewer convergence clubs, such that the results after the merging algorithm are still quite similar to the results when c=0. When c is slightly below zero, we still have three convergence clubs with the same or almost the same countries in each club. When c is very low, but still high enough to not reject convergence in all convergence clubs, only the rich and medium convergence clubs are left, with Moldova in the medium convergence club and Ukraine as a diverging country. When c gets larger than zero, more convergence clubs are initially formed, but we always end up with three or four convergence clubs after applying the merging algorithm. Ukraine is sometimes a member of the poorest convergence club and sometimes a diverging country. When c is very high, no additional countries are added to the core group, such that only countries with a similar log GDP per capita value in the last year are in the same convergence club. For all values of c, EU countries are more likely to end up in one of the richest convergence clubs, and non-EU countries are more likely to end up in one of the poorest convergence clubs.

We estimate the t-statistics using the HAC variance-covariance estimator. The kernel function and the bandwidth characterize the HAC estimator. We use the Bartlett kernel function in our calculations, following the research of Newey & West (1987). A different kernel function can lead to quite different results. For example, when we use a uniform kernel, Iceland belongs to the second instead of the first convergence club, and Slovak Republic belongs to the third instead of the second convergence club. All other countries still belong to the same convergence club as stated in Table 1. After performing the merging algorithm, these changes lead to 4 instead of 3 convergence clubs, as clubs 1 and 2, clubs 3 and 4, club 5, and club 6 each form a convergence club. Although the convergence clubs are different, it is still the case that EU countries are more likely to belong to the richest convergence club than the second richest convergence club. The two poorest convergence clubs only contain non-EU countries. The bandwidth equals three in our calculations. Using a different bandwidth leads to slightly different t-statistics but not to different convergence clubs. The convergence club results only change when we use an extremely high or low bandwidth.
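To illustrate how the kernel and the bandwidth enter the HAC estimator, the sketch below computes a Bartlett-kernel (Newey-West) long-run variance of a single residual series. It is a simplified scalar version and not the full variance-covariance estimator of the log t regression. It assumes that a bandwidth of three means three autocovariance lags with weights 1 − l/(bandwidth + 1); other conventions for the bandwidth exist. The function name and the example series are hypothetical.

```python
def bartlett_long_run_variance(u, bandwidth=3):
    """Newey-West long-run variance of the series u using Bartlett kernel weights."""
    n = len(u)
    mean = sum(u) / n
    e = [v - mean for v in u]

    def autocov(lag):
        # Sample autocovariance at the given lag, normalised by n.
        return sum(e[t] * e[t - lag] for t in range(lag, n)) / n

    lrv = autocov(0)
    for lag in range(1, bandwidth + 1):
        # Assumption: Bartlett (triangular) weight 1 - lag / (bandwidth + 1).
        weight = 1.0 - lag / (bandwidth + 1)
        lrv += 2.0 * weight * autocov(lag)
    return lrv


# Hypothetical residual series with strong negative autocorrelation.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

lrv = bartlett_long_run_variance(residuals, bandwidth=3)
print(lrv)
```

A uniform kernel would set all weights equal to one up to the bandwidth, which changes the estimated variance and hence the t-statistics, and this is why the choice of kernel can move countries between clubs.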

Den Haan & Levin (1998) propose a vector autoregressive (VAR) model to construct a pre-whitened HAC estimator. They show that their VARHAC estimator converges faster than the kernel-based estimator. We determine the order of VAR using the Akaike Information Criterion