Trade Blocs and the Global Digital Divide: A Spatial Panel Data Approach


ABSTRACT: In order to get a better understanding of worldwide Internet usage differences, spatial interaction effects are added to a model explaining cross-country growth in Internet usage. The paper finds that ICT infrastructure growth has a positive and significant effect on Internet usage growth in one’s own country as well as in other countries. The findings suggest that the efficiency of policies aimed at decreasing the global digital divide can be increased if they are initiated on a trade bloc level. Contrary to earlier papers no significant role for income in explaining cross-country Internet usage differences is found.

KEYWORDS: Digital divide, Internet, Spatial modelling, ICT infrastructure

JEL: O21, O32, O33

Name author: Kasper J. Mulder

Student ID number: 1896806

Supervisors: J.P. Elhorst and N. Grau

Date: June 2015

Master Thesis Economics, University of Groningen


There exists a growing concern about the worldwide digital divide. This divide is defined by the OECD (2001, p.4) as “the gap between individuals, households, businesses and geographic areas at different socio-economic levels with regard to their opportunities to access information and to their ability to use the Internet for a wide variety of activities”.

In the literature, most references to the digital divide equate it with worldwide differences in access to the Internet. Access is mostly measured by the number of Internet users (Beilock and Dimitrova, 2003; Chinn and Fairlie, 2006 and Guillén and Suárez, 2005) or the number of Internet hosts (Hargittai, 1999; Kiiski and Pohjola, 2002 and Oyelaran-Oyeyinka and Lal, 2005).

Figure 1 shows the average Internet usage over the period 2000-2009 for high, middle and low income countries, as classified in the year 2000 by the World Bank (Soubbotina and Sheram, 2000). It shows that Internet usage in both high income countries and middle income countries has grown at a fast pace. However, the gap between the high income and middle income countries is substantial and has stayed quite constant over the period. The gap between the high income countries and low income countries is even more striking. Also, the low income countries on average show virtually no growth, further widening the gap between them and the high and middle income countries. International agencies like the World Bank, UN and OECD have expressed growing concern that the widening gap between developing and developed countries may leave many nations economically behind (World Bank Group strategy for ICT, 2012; International Telecommunication Union, 2012 and OECD, 2001).

Figure 1: Average Internet usage as percentage of total population for the period 2000-2009


This concern is linked to the fact that widespread access to the Internet has become a key driver of country competitiveness and economic growth (Röller and Waverman, 2001; Kenny, 2003 and Czernich, Falck, Kretschmer and Woessmann, 2009). The Internet holds great promise for increasing productivity and improving accountability and governance. It increasingly determines the ability of individuals, firms and territories to remain competitive by producing and working more efficiently.

Furthermore, the Internet is recognized as being able to help reduce poverty directly and indirectly by providing access to information, equalizing opportunities in rural areas and contributing to pro-poor market developments such as microfinance and mobile money (Madon, 2000; World Bank, 2012). Inequality in worldwide diffusion of Internet may therefore have serious implications for worldwide differences in economic growth and human development.

Following the recognition of the importance of worldwide Internet diffusion for the economy, governments and organizations have initiated many programs to increase people's access to the Internet. The World Bank Group strategy for ICT (2012) acknowledges that the worldwide focus of universal access policies and programs has changed from voice telephony to broadband for high-speed Internet. Current universal access programs, however, show mixed results. Government IT spending is considered to be of a high-risk, high-reward nature. The World Bank Group strategy for ICT finds that only 59% of the World Bank’s IT project components have achieved or are likely to achieve their objectives fully or substantially.

In order to effectively deal with the worldwide digital divide the success rate of programs aimed at increasing Internet usage in countries therefore must increase. For this it is essential to know which factors are the main drivers behind the growth of a country’s Internet access. Understanding the determinants of worldwide Internet diffusion is therefore of great interest.

To date, cross-country literature has mainly focused on national factors affecting the Internet usage growth. However, given the highly inter-dependent nature of Internet as well as the globalization of most economies, Internet usage in a country is likely to also be determined by spatial interaction between countries. In this context, this paper contributes to the existing literature by capturing this spatial component using spatial modelling.


period. Whereas for all non-developed countries the percentage of Internet users in the total population was less than 10% in 2000, many regional differences can be seen in 2009. For the non-developed countries, the highest growth took place in South America and Eastern Europe, followed by Central Asia and the Arab countries. Internet usage in Central American, Sub-Saharan and South Asian countries stayed behind, with less than 10% of the population using the Internet in almost all countries in those regions.

Figure 2: Worldwide Internet usage as percentage of total population in the years 2000 and 2009 (two panels: Year 2000 and Year 2009)

Source: Data from the UN International Telecommunication Union

To capture the spatial component in the model explaining growth in Internet usage two different hypotheses are tested in this paper. The first hypothesis deals with spatial interaction effects between countries based on geographical factors. The first hypothesis is tested using a first-order binary contiguity matrix and states:


Internet usage in a country is affected by domestic effects as well as spillover effects from neighboring countries

The motive behind extending the model with spatial interaction effects based on geographical factors is that the possibilities of the Internet increase with the number of people using it. Although distance does not affect the working of Internet itself, economic activity associated with Internet usage is affected by distance.

If neighboring countries increase their usage of the Internet, this increases the possibilities of Internet usage for a country in various ways. Items can be sold via the Internet in neighboring countries and vice versa, companies and organizations operating in both countries are able to improve their processes using the Internet, and responses to mutual problems such as crime and natural disasters can be better coordinated, among many other possibilities.

The second hypothesis deals with spatial interaction effects between countries sharing a great deal of economic activity. In this paper economic activity is measured as trade flows between countries. The second hypothesis states:

Internet usage in a country is affected by domestic effects as well as spillover effects from its trading partners

The motive behind this spatial extension is that if a country’s trading partners make little use of the Internet, the necessity of Internet usage for economic activities in that country is low. On the other hand, if the trading partners’ economies are highly dependent on the Internet, a country also needs to adopt an Internet-based economy in order to trade efficiently with its partners.

This hypothesis is tested using three different spatial matrices. One matrix is based on the main export partners and main import partners over the sample period. The second matrix is based on relative trade between all countries in the base-year 2000. The last matrix is a block-diagonal matrix based on worldwide trade blocs.


for ICT infrastructure in explaining Internet usage. These results indicate that Internet usage in a country is affected by both domestic variables and foreign variables.

The paper proceeds as follows. Section 1 discusses the main literature on the determinants of worldwide differences in Internet usage. Thereafter, section 2 discusses the theory on spatial modelling. Section 3 discusses the data used in this paper. In section 4, a benchmark model is first constructed on the basis of the variables discussed in the main literature. Second, the spatial models are discussed. Furthermore, based on economic theory, the different W-matrices used for the spatial models are introduced. Thereafter, section 5 discusses the methodology. Section 6 discusses the results. Section 7 elaborates on the policy implications following the results. Finally, section 8 discusses the limitations and further research opportunities, after which section 9 states the conclusion of this paper.

1. LITERATURE REVIEW

Many different variables have been identified as determinants of Internet usage in a country. This section discusses the main literature on cross-country Internet usage.

Hargittai (1999) is considered the first major study explaining differences in Internet connectivity. Focusing on OECD countries only, the study finds a significant positive effect of GDP per capita and phone density. Furthermore, it finds a significant negative effect of the existence of a monopoly in the telecom sector, suggesting an important role for telecommunication policy. Education, pricing and income distribution, measured by the Gini coefficient, do not appear to have a significant influence on Internet usage in this model.


For the larger sample the results change so that investment in education does become significant. However, data on Internet access cost is excluded as it is not available for the developing countries. Instead, access cost based on telephone tariffs is used, which is only significant when technological infrastructure is controlled for.

Beilock and Dimitrova (2003) also find that per capita income is the most important determinant of Internet diffusion. Other important determinants they find are ICT infrastructure and the openness of a society, measured using a Freedom and non-Freedom dummy. For non-economic factors other than openness they do not find a significant relationship.

Oyelaran-Oyeyinka and Lal (2003) use cross-country data on Africa alone. In accordance with previous studies they find a significant positive influence of GDP per capita and ICT infrastructure, measured using the number of telephone lines and computers per capita. Furthermore they find a significant and positive coefficient for education.

Guillén and Suárez (2005) study the determinants of cross-national differences in Internet use using data on 118 countries over the period 1997-2001. They also find a significant effect of GDP per capita and the number of telephone lines. Contrary to Kiiski and Pohjola (2002), they do find a significant positive effect of competition in the telecommunication sector.

Wunnava and Leiter (2009) employ cross-sectional data from 100 countries to analyze the main determinants of inter-country Internet diffusion. They identify economic strength, measured as GDP per capita, telecommunication and ICT infrastructure, English proficiency and a country’s political and economic openness as the fundamental factors in determining worldwide Internet diffusion. In addition, they find tertiary school enrollment and income equality to play a significant role.

Andrés, Cuberes, Diouf and Serebrisky (2010) analyze the process of Internet diffusion using a panel of 214 countries during the period 1990-2004. The main determinants of Internet diffusion they find are GDP per capita, real cost of Internet and computers per capita. Furthermore, they find evidence for national network effects, measured as the lag of the number of users per capita in a given country.


significant effect. Contrary to other papers they also find the quality of legal institutions to significantly affect Internet usage.

Table 1 summarizes the different explanatory variables proposed in the literature. It specifies the significant and insignificant explanatory variables for each paper, where X stands for significant and O stands for insignificant. Furthermore the dependent variable and countries included in the model are specified.

Table 1: Overview of determinants of Internet usage in discussed literature

Note: X indicates a significant effect and O indicates a non-significant effect on Internet usage. * Only for OECD countries. ** Only countries with more than 50 Internet hosts in 1995.


2. SPATIAL MODELLING

This section addresses the theory on spatial econometrics that is used to construct the spatial models in section 4. The main understanding of spatial econometric modeling is obtained from Elhorst (2014).

2.1 Spatial models

Spatial econometric models are based on the idea that cross-sectional units interact with each other. Three different types of interaction effects in a spatial model can be distinguished. First are the endogenous interaction effects among the dependent variable (Y), in which the value of the dependent variable for one unit is jointly determined by the dependent variable of another unit. Second are the exogenous interaction effects among the explanatory variables (X), where the dependent variable of a particular unit depends on independent explanatory variables of other units. Third are the interaction effects among the error terms (ε), which occur when omitted determinants of the dependent variable are spatially autocorrelated across units. Furthermore, they occur when unobserved shocks follow a spatial pattern.

The non-spatial regression model is introduced in the form:

$$Y = \alpha \iota_N + X\beta + \varepsilon \qquad (1)$$

where Y denotes an N*1 vector of the dependent variable, $\alpha$ is the constant term parameter, $\iota_N$ is an N*1 vector of ones, X denotes an N*K matrix of explanatory variables, $\beta$ represents a K*1 vector of fixed but unknown parameters and $\varepsilon$ is the error term.

From (1), the full model with all types of interaction effects takes the following form:

$$Y = \delta W Y + \alpha \iota_N + X\beta + W X\theta + u \qquad (2)$$

where $u = \lambda W u + \varepsilon$ and W represents a non-negative N*N spatial weights matrix describing the spatial relationship between the units. The W-matrix is further discussed below.

WY denotes the endogenous interaction effects among the dependent variable, WX denotes the exogenous interaction effects among the explanatory variables and Wu denotes the interaction effects among the error terms. $\delta$ is the spatial autoregressive coefficient, $\theta$ represents a K*1 vector of fixed but unknown parameters and $\lambda$ is the spatial autocorrelation coefficient.

These different types of interaction effects result in seven different spatial models. These models, including their interaction effects, are summarized in table 2.¹

Table 2: Different spatial models

Spatial model                                  Interaction effects among
General nesting spatial model (GNS)            Y, X and ε
Spatial autoregressive combined model (SAC)    Y and ε
Spatial Durbin model (SDM)                     Y and X
Spatial Durbin error model (SDEM)              X and ε
Spatial lag model (SAR)                        Y
Spatial lag of X model (SLX)                   X
Spatial error model (SEM)                      ε
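As an illustration of how the models in table 2 are nested in equation (2), the following minimal Python sketch draws Y from the full model and obtains the simpler models by setting δ, θ and/or λ to zero. The function name, the three-unit W-matrix and all parameter values are assumptions chosen only for this example and are not taken from the empirical part of this paper.

```python
import numpy as np

def simulate_spatial_model(X, W, beta, alpha=0.0, delta=0.0, theta=None, lam=0.0, rng=None):
    """Draw Y from Y = delta*W*Y + alpha*iota + X*beta + W*X*theta + u,
    with u = lam*W*u + eps (hypothetical helper, for illustration only)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n, k = X.shape
    theta = np.zeros(k) if theta is None else np.asarray(theta, dtype=float)
    eye = np.eye(n)
    eps = rng.standard_normal(n)
    # Spatially autocorrelated errors: u = (I - lam*W)^{-1} eps
    u = np.linalg.solve(eye - lam * W, eps)
    # Reduced form: Y = (I - delta*W)^{-1} (alpha*iota + X*beta + W*X*theta + u)
    rhs = alpha * np.ones(n) + X @ beta + W @ X @ theta + u
    return np.linalg.solve(eye - delta * W, rhs)

# Row-normalized W for three units on a line (illustrative values)
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
X = np.array([[1.0], [2.0], [3.0]])
beta = np.array([2.0])

y_ols = simulate_spatial_model(X, W, beta, alpha=1.0)                        # delta = theta = lam = 0
y_sar = simulate_spatial_model(X, W, beta, alpha=1.0, delta=0.4)             # interaction among Y only
y_slx = simulate_spatial_model(X, W, beta, alpha=1.0, theta=[0.5])           # interaction among X only
y_sem = simulate_spatial_model(X, W, beta, alpha=1.0, lam=0.3)               # interaction among errors only
y_sdm = simulate_spatial_model(X, W, beta, alpha=1.0, delta=0.4, theta=[0.5])
```

Because every call uses the same seed, the draws differ only through the interaction effects that are switched on, which makes the restrictions behind each acronym directly comparable.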

It should be noted however that use of the GNS model including the full set of interaction effects is unpopular for two reasons. In the first place, no general conditions under which the parameters of the GNS model are identified have been provided. Only Lee, Liu and Lin (2010) derive such conditions, and only for a specific form of the spatial weights matrix which is rarely used in applied research. Second, the parameters of the GNS model are only weakly identified, which has the effect that the significance levels of all variables go down. Given this, the GNS model is not dealt with in the empirical part of this paper.

Spatial econometric models differ from non-spatial models in the sense that the explanatory variables cause both direct and indirect effects. The direct effect is the effect of a particular explanatory variable of a unit on its own dependent variable. The indirect effect is the effect of a particular explanatory variable of a unit on the dependent variable in other units. In their book LeSage and Pace (2009) point out that interpreting these effects provides a better basis for testing whether spatial spillovers exist than only looking at δ, θ and/or λ of one or more spatial regression models.

These direct and indirect effects are derived by first rewriting (2) as:

$$Y = (I - \delta W)^{-1}(X\beta + WX\theta) + (I - \delta W)^{-1}(\alpha \iota_N + u) \qquad (3)$$

¹ The acronyms used in this paper are based on the book by Elhorst (2014).

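Taking the partial derivatives of the expected value of Y with respect to the kth explanatory variable of X in unit 1 to N then gives:

$$\left[ \frac{\partial E(Y)}{\partial x_{1k}} \; \cdots \; \frac{\partial E(Y)}{\partial x_{Nk}} \right] = (I - \delta W)^{-1} \begin{bmatrix} \beta_k & w_{12}\theta_k & \cdots & w_{1N}\theta_k \\ w_{21}\theta_k & \beta_k & \cdots & w_{2N}\theta_k \\ \vdots & \vdots & \ddots & \vdots \\ w_{N1}\theta_k & w_{N2}\theta_k & \cdots & \beta_k \end{bmatrix} \qquad (4)$$

The diagonal elements of the resulting N*N matrix of partial derivatives represent direct effects, whereas the off-diagonal elements represent indirect effects.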
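The matrix of partial derivatives in (4) can be computed numerically, as in the following minimal Python sketch. The function names and the three-unit W-matrix are hypothetical. Summarizing the matrix by the average diagonal element and the average row sum of the off-diagonal elements is an assumption made here for illustration, since the text does not prescribe a particular summary measure.

```python
import numpy as np

def effects_matrix(W, beta_k, theta_k, delta):
    """N x N matrix of partial derivatives of E(Y) with respect to x_k, as in (4)."""
    eye = np.eye(W.shape[0])
    # beta_k on the diagonal and w_ij * theta_k off the diagonal (the diagonal of W is zero)
    inner = beta_k * eye + theta_k * W
    # Solving the linear system is equivalent to pre-multiplying by (I - delta*W)^{-1}
    return np.linalg.solve(eye - delta * W, inner)

def summarize_effects(S):
    """Average direct effect (mean diagonal element) and average indirect
    effect (mean row sum of the off-diagonal elements)."""
    n = S.shape[0]
    direct = np.trace(S) / n
    indirect = (S.sum() - np.trace(S)) / n
    return direct, indirect

# Row-normalized W in which every unit has the other two as neighbors (illustrative)
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

# SLX-type case: delta = 0, so the effects are simply beta_k and theta_k
direct_slx, indirect_slx = summarize_effects(effects_matrix(W, 1.0, 0.5, 0.0))
# SAR-type case: theta_k = 0, so both effects are multiples of beta_k
direct_sar, indirect_sar = summarize_effects(effects_matrix(W, 1.0, 0.0, 0.4))
```

With a row-normalized W and θ_k = 0, the direct and indirect effect in this sketch add up to β_k / (1 − δ), which offers a quick check on the computation.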

As the different spatial models include different sets of interaction effects, it follows from (4) that they also have different direct and indirect effects. For example, if both δ=0 and θ=0 no indirect effects occur. Table 3, obtained from Halleck Vega and Elhorst (2015), summarizes the direct and indirect effects for the different models.

Table 3: Direct and indirect effects of different models

Model       Direct effects                                                   Indirect effects
OLS/SEM     $\beta_k$                                                        0
SAR/SAC     Diagonal elements of $(I - \delta W)^{-1}\beta_k$                Off-diagonal elements of $(I - \delta W)^{-1}\beta_k$
SLX/SDEM    $\beta_k$                                                        $\theta_k$
SDM/GNS     Diagonal elements of $(I - \delta W)^{-1}(\beta_k + W\theta_k)$  Off-diagonal elements of $(I - \delta W)^{-1}(\beta_k + W\theta_k)$

Source: Halleck Vega and Elhorst (2015)

In both the OLS model and the SEM model the direct effect of an explanatory variable is equal to its coefficient estimate $\beta_k$. As δ=0 and θ=0 in these models, the indirect effects are zero. In the SLX and SDEM models the direct effects are also equal to $\beta_k$. These models however do have an indirect effect, which equals the coefficient estimate of the spatially lagged variable, $\theta_k$.

For the SAR and SAC models the direct effect of the kth explanatory variable is not equal to $\beta_k$, but to $\beta_k$ multiplied by a number equal to or greater than one. $\beta_k$ also appears in the indirect effect, with the result that the ratio between the direct and indirect effect is the same for all explanatory variables. For the SDM and GNS models the direct and indirect effects of the explanatory variables depend on both $\beta_k$ and $\theta_k$. This means that no prior restrictions are imposed on the magnitude of both effects and thus that the ratio between them can differ for different explanatory variables.


Indirect effects can further be split into local and global effects. Local effects occur if θ≠0 and δ=0. They are called local effects because, as follows from (4), the indirect effects only fall on the spatial units for which the elements of the W-matrix are non-zero. Global indirect effects occur when θ=0 and δ≠0, as the effects fall on all units. The idea behind this is that although many elements of the W-matrix can be zero, the elements of $(I - \delta W)^{-1}$ are not zero. This follows from the expansion $(I - \delta W)^{-1} = I + \delta W + \delta^2 W^2 + \dots$, which also passes effects on to neighbors of neighbors and beyond.
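The difference between local and global indirect effects can be made concrete with a small numerical sketch. The four-unit W-matrix and the parameter values below are assumptions chosen only for illustration: the units lie on a line, so the first and the last unit are not neighbors.

```python
import numpy as np

# Four units on a line (1-2-3-4), row-normalized: units 1 and 4 are not neighbors
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
beta_k, theta_k, delta = 1.0, 0.5, 0.4   # illustrative values
eye = np.eye(4)

# Local spillovers (theta != 0, delta = 0): the effects matrix is beta_k*I + theta_k*W,
# so it is zero wherever W is zero
local_effects = beta_k * eye + theta_k * W

# Global spillovers (theta = 0, delta != 0): the effects matrix is (I - delta*W)^{-1} * beta_k,
# which is non-zero even for pairs of units that are not neighbors
global_effects = np.linalg.inv(eye - delta * W) * beta_k

print("local effect of unit 4 on unit 1: ", local_effects[0, 3])
print("global effect of unit 4 on unit 1:", global_effects[0, 3])
```

In the local case a change in unit 4 never reaches unit 1, whereas in the global case it does, passed on through units 3 and 2.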

2.2 W-matrix

The next part of this section discusses the spatial weights matrix. A spatial weights matrix W is generally a symmetric N*N matrix which describes the spatial arrangement of the units. Asymmetric matrices also exist as exceptions; these are however not included in this paper and therefore not touched upon further. The matrix thus takes the following form:

$$W = \begin{bmatrix} w_{11} & \cdots & w_{1N} \\ \vdots & \ddots & \vdots \\ w_{N1} & \cdots & w_{NN} \end{bmatrix}$$

W is a nonnegative matrix with known constants where the diagonal elements are always set to zero, as no unit can have a spatial relationship with itself. The row elements of a weights matrix display the impact on a particular unit by all other units. Conversely, the column elements of a weights matrix display the impact of a particular unit on all other units.

Usually W is row-normalized, so that the elements of each row sum to one and all weights lie between zero and one. This has the effect that the total impact on each unit by all other units is equalized. If W is row-normalized, X may not contain a constant, as the constant and its spatial lag would be identical and X and WX would become perfectly multicollinear.

Alternatively, W can be matrix-normalized. This is done by dividing the elements of W by its largest characteristic root. The effect is that the largest characteristic root of the normalized W has a value of 1, just as in the case where W is row-normalized. The advantage of this type of normalization is that the proportions between the elements of W do not change and thereby keep their economic interpretation.
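Both normalizations can be sketched in a few lines of Python. The function names and the three-unit binary contiguity matrix are hypothetical and serve only to illustrate the two procedures described above.

```python
import numpy as np

def row_normalize(W):
    """Divide each row by its row sum so that every row sums to one."""
    row_sums = W.sum(axis=1, keepdims=True)
    # Rows without any neighbor stay zero instead of causing a division by zero
    return np.divide(W, row_sums, out=np.zeros_like(W, dtype=float), where=row_sums != 0)

def matrix_normalize(W):
    """Divide all elements by the largest characteristic root (eigenvalue) of W."""
    largest_root = np.max(np.linalg.eigvals(W).real)
    return W / largest_root

# Binary contiguity matrix for three units on a line (illustrative)
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

W_row = row_normalize(W)      # rows sum to one, symmetry is lost
W_mat = matrix_normalize(W)   # proportions between elements are preserved
```

In this example the row-normalized matrix is no longer symmetric, because the middle unit has two neighbors and the outer units only one, while the matrix-normalized version keeps all non-zero weights equal to each other.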

Finally, in order to limit the cross-sectional correlation to a manageable degree it is important that one of the following two conditions holds. The first condition originates from Kelejian and Prucha (1998, 1999) and states that the row and column sums of the matrices $W$, $(I_N - \delta W)^{-1}$ and $(I_N - \lambda W)^{-1}$ before W is row-normalized should be uniformly bounded in

states that the row and column sums of W before W is row-normalized should not diverge to infinity at a rate equal to or faster than the rate of the sample size N.

In his paper, Leenders (2002) stresses the vital importance of the chosen specification of W. The usefulness of the entire approach of spatial econometric models hinges upon this matrix as the value and significance level of the interaction parameters depend on it.

The spatial weights matrix W however cannot be estimated. It has to be specified beforehand and preferably follows from the economic theory at hand. This gives difficulties, as different theories imply a different W and often theory does not have much to say about the right specification of W. Therefore, empirical researchers often investigate whether the results are robust to the specification of W. This is done by estimating the same spatial econometric model several times using different spatial weights matrices and investigating whether the results are sensitive to the choice of W.

For each spatial model many different matrices can often be specified following economic theory; however, there exist four commonly used matrices in applied research. The first is the p-order binary contiguity matrix, where if p = 1 only first-order neighbors are included, if p = 2 first and second-order neighbors are included, etcetera. Second is the inverse distance matrix, which is specified using the distances between units. Sometimes a cut-off point is used, which specifies the maximum distance at which units still have an impact on one another. Third is the q-nearest neighbor matrix, where q is a positive integer denoting the number of nearest neighbors considered. Finally the block diagonal matrix is commonly used, where each block represents a group of spatial units that interact with each other but not with units in other groups.
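The four types of matrices can be constructed as in the following minimal Python sketch. All function names, the border list, the distances and the group labels are hypothetical; only the first-order case of the contiguity matrix is shown and none of the matrices is normalized yet.

```python
import numpy as np

def contiguity_matrix(borders, n):
    """First-order binary contiguity: w_ij = 1 if units i and j share a border."""
    W = np.zeros((n, n))
    for i, j in borders:
        W[i, j] = W[j, i] = 1.0
    return W

def inverse_distance_matrix(D, cutoff=None):
    """w_ij = 1 / d_ij, with zeros on the diagonal and beyond an optional cut-off."""
    W = np.zeros_like(D, dtype=float)
    mask = D > 0
    if cutoff is not None:
        mask = mask & (D <= cutoff)
    W[mask] = 1.0 / D[mask]
    return W

def nearest_neighbor_matrix(D, q):
    """w_ij = 1 if unit j is among the q nearest units to unit i."""
    n = D.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        dist = D[i].astype(float).copy()
        dist[i] = np.inf                      # a unit is not its own neighbor
        W[i, np.argsort(dist)[:q]] = 1.0
    return W

def block_diagonal_matrix(groups):
    """w_ij = 1 if units i and j (i != j) belong to the same group, e.g. a trade bloc."""
    g = np.asarray(groups)
    W = (g[:, None] == g[None, :]).astype(float)
    np.fill_diagonal(W, 0.0)
    return W

# Four units on a line at positions 0, 1, 3 and 6 (illustrative)
positions = np.array([0.0, 1.0, 3.0, 6.0])
D = np.abs(positions[:, None] - positions[None, :])

W_contiguity = contiguity_matrix([(0, 1), (1, 2), (2, 3)], n=4)
W_distance = inverse_distance_matrix(D, cutoff=2.0)
W_nearest = nearest_neighbor_matrix(D, q=1)
W_blocks = block_diagonal_matrix([0, 0, 1, 1])   # units ordered by group membership
```

Note that the q-nearest neighbor matrix is generally asymmetric: in this example unit 3 has unit 2 as its nearest neighbor, but not the other way around.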


two non-geographic measures of connectivity, being trade and common dyadic membership, in various spatial analyses.

This paper uses geographical as well as non-geographical matrices to test the hypotheses. Based on the information discussed in this and previous sections, the next sections introduce the data and models used in this paper.

3. DATA

This section introduces the panel data set which is used in this paper. The variables included in the dataset are based on the literature discussed above. Data of 167 countries over the period 2000-2009 is used. Because spatial models require a complete balanced dataset these countries are chosen based on the availability of data. Appendix A reports a list of all the countries included in the dataset. All large as well as semi-large economies from all regions in the world are included.

3.1 Dependent variable

The dependent variable in this paper is the percentage of Internet users of total population. Data on percentage of individuals using the Internet per country is taken from the ITU, which is the UN specialized agency for ICTs. The data of the ITU are collected from an annual questionnaire sent to official economy contacts, usually the regulatory authority or the ministry in charge of telecommunication and ICT, and from reports provided by telecommunication ministries, regulators and operators.


Figure 3: Average Internet usage in percentage of total population for the period 2000-2009

To test for stationarity of panel data series several unit root tests are available. The size of the sample as well as the assumption of a common autoregressive parameter determine which test is most appropriate in a given situation.

For this paper tests developed by Im, Pesaran and Shin (2003) are used. In contrast to various other panel unit root tests these tests allow the autoregressive parameter to be panel specific. Given the great diversity of countries in the sample the assumption that all panels share the same autoregressive parameter might be too simplifying. Furthermore, these tests fit best when having a sample with a relatively large number of panels in comparison with time periods, as is the case with the sample in this paper.

Im–Pesaran–Shin (IPS) tests have as null hypothesis that all the panels contain a unit root against the alternative hypothesis that some panels are stationary. The test, including a constant and trend, does not reject this null hypothesis, indicating that the panel series of Internet usage is non-stationary.

To deal with the non-stationarity of the series first differences are taken. Figure 4 shows the first differences of Internet usage per year as average of all countries. The unit root test including a constant rejects the null hypothesis of all panels containing a unit root at the 1% level.
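Taking first differences of a balanced panel can be sketched as follows. The function name and the two artificial country series are assumptions for illustration only, and the sketch does not implement the IPS test itself.

```python
import numpy as np

def first_differences(panel):
    """panel has shape (N countries, T years); the result has shape (N, T-1)
    and contains panel[:, t] - panel[:, t-1] for every country."""
    return np.diff(panel, axis=1)

# Two hypothetical countries observed over ten years (2000-2009)
years = np.arange(10)
usage = np.array([
    10.0 + 3.0 * years,      # usage grows by 3 percentage points every year
    0.5 * years ** 2,        # usage grows by an increasing amount every year
])

growth = first_differences(usage)       # nine yearly changes per country
average_growth = growth.mean(axis=0)    # cross-country average per year
```

Ten yearly observations yield nine first differences per country, which is why the differenced series covers one period less than the series in levels.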

Figure 4: Average yearly growth in Internet usage in percentage of total population for the period 2000-2008


3.2 Explanatory variables

From table 1 the major role of income differences in explaining Internet usage is evident. All papers discussed find a significant positive result for the level of income. Data on income is widely available for many countries over a large period of time. For this paper GDP per capita in USD is used, which is collected from the World Bank database. GDP per capita is in constant 2005 USD.

Figure 5 shows the level of GDP per capita per year as average of all countries. As with the level of Internet usage, a clear upward trend can be seen here. On average GDP per capita was $9218.02 in 2000 and rose to $13086.55 in 2009. The country with the highest GDP per capita was Luxembourg in both years, where it increased from $72865.06 in 2000 to $79002.74 in 2009. The lowest GDP per capita grew at a much slower pace, being $137.50 for Ethiopia in 2000 and $150.22 for Burundi in 2009.

Figure 5: Average GDP per Capita in constant 2005 USD for the period 2000-2009

Source: Data from the World Bank

To measure the relative effect of income on growth in Internet usage, log-levels of GDP per capita are taken. The IPS unit root test including a constant and a trend does not reject the null hypothesis of all panels containing a unit root. GDP per capita can thus be regarded as a non-stationary variable. To deal with the non-stationarity the first differences of the variable are taken. Figure 6 shows the average of the first differences for the sample period. The IPS unit root test including a constant rejects the null hypothesis at the 5% level.
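The reason for differencing log-levels is that a difference in logs approximates a relative growth rate. The following small sketch illustrates this using the GDP per capita figures for Luxembourg quoted above. It compares only the two end points 2000 and 2009, which is an assumption made for illustration, as the models in this paper use yearly differences.

```python
import math

# GDP per capita of Luxembourg in constant 2005 USD, as quoted in the text
gdp_2000 = 72865.06
gdp_2009 = 79002.74

# Difference in log-levels over the whole period
log_difference = math.log(gdp_2009) - math.log(gdp_2000)
# Exact relative growth over the same period, for comparison
growth_rate = gdp_2009 / gdp_2000 - 1

print(f"log-difference 2000-2009:  {log_difference:.4f}")
print(f"relative growth 2000-2009: {growth_rate:.4f}")
```

The two numbers are close to each other, and they move closer together the smaller the change is, which is typically the case for year-on-year differences.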


Figure 6: Average yearly growth in log GDP per Capita in constant 2005 USD for the period 2000-2008

Source: Data from the World Bank

However, including first differences in the model instead of levels of GDP per capita is not without consequences. When using first differences the effect of growth in income on growth in Internet usage is measured, whereas using levels measures the effect of a higher level of income on the level of Internet usage. From a theoretical point of view it is very well possible that the level of income has a significant effect on Internet usage, whereas income growth does not. Internet could for example be regarded as a non-essential good which is only used by people having an income above a certain threshold. If a country’s average income is well below this threshold, a high increase in income might have no or only a small effect on Internet usage. Conversely, growth in Internet usage in a country with an average income level above this threshold might be affected even by a limited growth in income.

The literature discussed in this paper finds a significant effect for income level, not income growth. From these papers it is however unclear whether the non-stationarity of GDP per capita is dealt with. The results in these studies may therefore also be the result of a spurious relationship.

One alternative for addressing the different levels in income of the countries is to use a different time-varying measure of level of income. Examples of such variables are GNI per capita and the Human Development Index. However, these variables are greatly determined by or move parallel to GDP. Unit-root tests for these variables indeed indicate that they are also non-stationary.

A second alternative is to include the base-year level of GDP per capita as a variable in the model. This variable does not suffer from non-stationarity as it is a constant. With panel data however, including a time-constant variable is only possible under random effects models. As discussed in the next section, the use of random effects imposes restrictions which are unrealistic for this data. Consequently the models in this paper only include growth in GDP per capita and not its level.

Given the importance of income level found by all other papers it is useful to assess if the exclusion of income level has major implications for the obtained results. In section 6.3 the robustness of the obtained results will therefore be tested by including a variable measuring income level. Because the results of these models can be based on a spurious relationship they are only used to cross-check the results found in the paper. These results support the robustness of the results obtained by the other models in this paper.

Next to stationarity, GDP per capita also faces the problem of possible endogeneity. As noted in the introduction the importance of the Internet in many economies is growing. Therefore it is possible that Internet usage in a country partly determines GDP per capita. This endogeneity issue is a limitation of this paper. However, for the sample period of this paper the direct influence of Internet usage growth on GDP per capita growth can be assumed to be limited; the possible endogeneity issue is therefore not expected to have a great influence on the results obtained in this paper.

Besides income, a vast majority of the studies point out the importance of ICT infrastructure in explaining Internet diffusion. ICT infrastructure is the physical hardware used to interconnect computers and users.

ICT infrastructure is often measured as computers and fixed telephone subscriptions per capita. The use of the number of personal computers as explanatory variable can however be problematic, as it is difficult to maintain that this variable is truly exogenous. When determining Internet usage in the past it can be argued that the number of personal computers in a country was not or hardly determined by Internet usage, as the main purpose of those computers was not Internet usage. However, in the light of the rapid increase in importance of the Internet since the beginning of this millennium, it is likely that the availability of the Internet determines the decision to acquire a personal computer, thereby reversing the causality.

does not have to indicate a decrease in ICT infrastructure, as it is very well possible that people substitute their fixed telephone lines for mobile cellular.

Alternatively, this paper uses mobile cellular subscriptions as an indicator of ICT infrastructure. The use of these devices requires a comprehensive technological infrastructure. Furthermore, in the period covered by this paper the use of Internet on mobile phones was not yet widespread, which makes reverse causality unlikely.

Data on mobile cellular subscriptions as a percentage of total population are also taken from the ITU. As figure 7 shows, mobile cellular subscriptions as a percentage of total population, averaged over all countries, show a clear upward trend. The average rose from 16% in 2000 to 83% in 2009. This rise occurred in all countries. The highest number of subscriptions per capita was found in Hong Kong in both 2000 and 2009, with 80% and 180% of total population respectively, implying more than one mobile cellular subscription per person on average in 2009. The lowest figures stayed far behind: less than 1% for Iraq in 2000 and 5% for Ethiopia in 2009.

Figure 7: Average mobile cellular subscriptions as percentage of total population for the period 2000-2009

The IPS unit root test including a constant and a trend does not reject the null hypothesis of all panels containing a unit root. To deal with the non-stationarity of the panel series, first differences are again taken. Figure 8 shows the first differences of mobile cellular subscriptions as a percentage of total population per year, averaged over all countries. For the differenced series the unit root test rejects the null hypothesis of all panels containing a unit root at the 1% level.

Figure 8: Average yearly growth in mobile cellular subscriptions as percentage of total population for the period 2000-2008
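The first-differencing step applied to the non-stationary series can be illustrated with a minimal sketch. The function name and the yearly values are hypothetical; only the endpoints (16 and 83) echo the sample averages quoted above, and the intermediate years are invented for illustration.

```python
def first_differences(levels):
    """Return the year-on-year changes of a list of yearly observations."""
    return [later - earlier for earlier, later in zip(levels, levels[1:])]

# Hypothetical subscription rates (% of population) for 2000-2009.
# Only the first and last value mirror the averages reported in the text.
subscriptions = [16, 22, 28, 35, 43, 51, 60, 68, 76, 83]

growth = first_differences(subscriptions)
print(len(growth))  # ten yearly levels leave nine yearly changes
print(sum(growth))  # the changes add up to the total rise over the period
```

Differencing ten yearly observations leaves nine changes per country, which is why the differenced figures cover one year less than the figures in levels.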

In line with standard consumer demand theory, several authors use the cost of Internet access as a determinant of Internet usage in a country. However, data on the cost of Internet is very limited, especially for non-developing countries. Therefore its effect on Internet usage has not yet been tested robustly and neither is it touched upon in this paper.

Education attainment of the population has also been conjectured as playing a critical role in some papers. In this paper, education attainment is measured by mean years of schooling of adults. It is calculated as the average number of years of education received by people aged 25 and older, converted from education attainment levels using official durations of each level. The data is collected from the United Nations Development Program (UNDP).

Figure 9 plots the movement of mean years of schooling of adults as average of all countries. The figure shows an increase over the years 2000-2009, but this increase is small. The average mean years of schooling grew by less than 0.8 years over the whole period. Differences in mean years of schooling between countries are however substantial, ranging from 1.10 for Niger to 12.70 for the United States in 2000 and from 1.30 for Burkina Faso to 12.90 for the United States in 2009.

Figure 9: Average mean years of schooling for the period 2000-2009

Figure 10 plots the averages of first differences of mean years of schooling. The IPS unit root test for stationarity including a constant rejects the null hypothesis of all panels containing a unit root at the 1% level. As with GDP per capita, log-levels are taken in order to measure the effect of relative growth on Internet usage.

Figure 10: Average yearly growth in log mean years of schooling for the period 2000-2008

Source: Data from the UNDP

Openness of a country is also identified by several papers as having a significant effect on Internet usage in a country. Following Beilock and Dimitrova (2003), data on openness is taken from the Freedom House index. This index is based on a 14-item Civil Liberties Checklist covering freedom of expression and belief, freedom of association and organizational rights, and personal autonomy and economic rights. The score of a country ranges from 0 to 100, where a score lower than 31 refers to “Free” and a score higher than 50 to “Not Free”.

Figure 11 shows the Freedom House’s index per year as average of all countries. Between 2000 and 2001 the average index decreased, but afterwards the figure shows a clear upward trend. In absolute value this upward trend is however limited, increasing from 46.28 to 48.63. Many individual countries however did experience great changes over the period 2000-2009.

Figure 11: Average Freedom House index for the period 2000-2009

Source: Data from the Freedom House Index

Figure 12 shows the average of the first differences. No clear trend can be detected. The IPS unit root test including a constant rejects the null hypothesis of all panels containing a unit root at the 1% level.

Figure 12: Average yearly growth in Freedom House index for the period 2000-2008

Source: Data from the Freedom House Index

Finally, table 1 identifies several other significant variables. However, only a very limited number of papers find a significant effect for these variables. Following Andrés et al. (2010), the inclusion of country effects in the model should be able to capture the cross-country differences explained by these variables.

4. MODEL

This section introduces the models used in this paper. First the base model is discussed, followed by the different spatial models.

4.1 Base model

Because the model is based on panel data a choice needs to be made between fixed and random effects. The advantage of fixed effects is that it allows for arbitrary correlation between the country-specific effect and the explanatory variables, whereas random effects require that the explanatory variables are strictly exogenous and uncorrelated with the individual specific effect. On the other hand, when using fixed effects, key explanatory variables which are constant over time are eliminated.

The restriction imposed by random effects is likely to be too restrictive, as it is unrealistic to assume that the omitted heterogeneity is uncorrelated with the regressors. Considering this, using fixed effects for the panel data is preferred. Besides country effects the model also includes time-period effects. Omitted effects that are common across all countries and occurred during the period 2000-2009 are therefore controlled for. The base model is as follows:

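\[
\Delta IU_{i,t} = \beta_1 \Delta \ln Y_{i,t} + \beta_2 \Delta TI_{i,t} + \beta_3 \Delta \ln E_{i,t} + \beta_4 \Delta F_{i,t} + \mu_i + \tau_t + \varepsilon_{i,t} \qquad (7)
\]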

where the subscript i denotes countries (i=1,…,167) and t denotes time (t=1,…,9).

IU stands for Internet usage, Y for GDP per capita, TI for ICT infrastructure, E for the education attainment and F for degree of openness. As noted Y and E are taken in logs to capture the effects of a percentage growth of these variables on the growth of Internet usage. Finally, the model includes the country fixed effects, 𝜇𝑖, and time-period fixed effects, 𝜏𝑡.

One would expect 𝛽₁ to be positive, since a higher income level is naturally associated with a higher use of technology and a higher possibility of purchasing services and goods for which Internet is used. A growth in income thus provides more opportunities for Internet usage, and thereby has a positive effect on the growth rate.

The measure of ICT infrastructure is also expected to have a positive effect on Internet usage, as an increase of the physical hardware used to interconnect computers and users is expected to increase efficient Internet usage in a country. The coefficient 𝛽₂ is thus expected to be positive.

The relationship between education attainment and Internet usage growth is also expected to be positive. Many reasons support this positive relationship. The two most apparent are that literacy is required, as the world wide web and email are almost entirely text based, and that some level of education is required to ensure a person is actually able to use a computer. Furthermore, academic institutions play an essential role in adopting new technologies like the Internet, thus strengthening the positive relationship.

Finally, an increase in the openness of a country is expected to have a positive effect on Internet usage. The motive behind it is that Internet facilitates access to very hard to control quantities of information. Closed countries try to limit the spread of this information, whereas open countries promote this spread. As an increase in openness is measured by a decrease in the Freedom House index, this implies that 𝛽₄ is expected to have a negative sign.

4.2 Spatial Models

To test the hypotheses, five different spatial models are considered in this paper: the SEM, SAR, SLX, SDM and SDEM models. The model including all spatial interaction effects is as follows:

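\[
\Delta IU_{i,t} = \delta W \Delta IU_{i,t} + \beta_1 \Delta \ln Y_{i,t} + \beta_2 \Delta TI_{i,t} + \beta_3 \Delta \ln E_{i,t} + \beta_4 \Delta F_{i,t} + \theta_1 W \Delta \ln Y_{i,t} + \theta_2 W \Delta TI_{i,t} + \theta_3 W \Delta \ln E_{i,t} + \theta_4 W \Delta F_{i,t} + \mu_i + \tau_t + u_{i,t}
\]

where $u_{i,t} = \lambda W u_{i,t} + \varepsilon_{i,t}$. Each of the spatial models follows from this model with the complete set of interaction effects.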
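For reference, the restrictions under which each of the five models is obtained from the full specification can be summarised as follows. This is a sketch based on the standard taxonomy of spatial models, written in the symbols of the equation above; it is not taken from the estimation output of this paper.

```latex
\begin{array}{ll}
\text{SAR:}  & \theta_1 = \dots = \theta_4 = 0,\ \lambda = 0 \\
\text{SEM:}  & \delta = 0,\ \theta_1 = \dots = \theta_4 = 0 \\
\text{SLX:}  & \delta = 0,\ \lambda = 0 \\
\text{SDM:}  & \lambda = 0 \\
\text{SDEM:} & \delta = 0
\end{array}
```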

To test the hypotheses proposed in the introduction four different W-matrices are used. One matrix is based on geographical factors, whereas the other three matrices are based on non-geographical factors. The geographical matrix is the first-order binary contiguity matrix, which assigns a one to an element if the country is a first-order neighbor and a zero otherwise. The non-geographical matrices are based on economic activity between countries. All W-matrices are row-normalized so that the impact on each unit by all other units is equalized.
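Row-normalization can be illustrated with a small sketch. The three-country contiguity pattern and the function name are hypothetical and serve only to show the mechanics; they are not part of the sample used in this paper.

```python
def row_normalize(matrix):
    """Scale every row so that it sums to one; rows of zeros are left as zeros."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized

# Hypothetical contiguity pattern for three countries A, B and C lying in a line:
# A borders B, B borders A and C, C borders B.
contiguity = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

W = row_normalize(contiguity)
for row in W:
    print(row)
```

After normalization each neighbor of country B receives a weight of one half, so that the spatially lagged variable is the average over the neighboring countries.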

The first-order binary contiguity W-matrix satisfies the required necessary conditions discussed in section 2. As no country is a neighbor of more than a certain number of countries, the first condition is satisfied.

Regarding the first-order binary contiguity W-matrix one would expect 𝛿 to be positive. This positive value is supported by the diffusion effect often associated with new technologies. The familiarity with and adoption of Internet in a country will increase if neighboring countries increase their Internet usage. Furthermore, the possibilities of Internet usage increase with number of users. A growth of Internet usage in a neighboring country is therefore likely to increase the possibilities of Internet, thereby increasing the demand for Internet.

Also 𝜃₁, 𝜃₂ and 𝜃₃ are expected to be positive for this W-matrix. As noted above, a higher income level in a country is naturally associated with a higher use of technology and a higher possibility of purchasing services and goods via the Internet. As economic activities increasingly do not stop at borders, a growth in the level of income in a neighboring country opens up possibilities for Internet usage in the focal country.

An increase in ICT infrastructure increases the efficient usage of Internet. The increased efficiency is likely to open up possibilities for Internet usage in neighboring countries. The same line of reasoning applies to the level of education. A higher level of education in a country increases its possibilities of exploiting Internet activity in neighboring countries, thereby increasing the demand there.

Lastly, regarding the first-order binary contiguity W-matrix, 𝜃₄ is expected to have a negative sign. If, due to an increase of openness, people in a country are more freely allowed to use Internet for a large range of activities, the possibilities of Internet usage in a neighboring country also increase, as information can be shared more easily.

For the non-geographical matrices trade flows are taken as the measure of economic activity between countries. Data on trade flows are taken from the COMTRADE database of the United Nations Statistical Division, cleaned by the BACI team using their own methodology of harmonization. Net total values in USD of the trade flows are used.

The first non-geographical W-matrix assigns a one to an element if it was the main export or main import partner of the country in the period 2000-2009, and a zero otherwise. A country is a main export partner of another country if in the period 2000-2009 it was most often the largest export destination of that other country. Similarly, it is a main import partner if it was most often the largest import origin in the period.
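The construction of the main trading partner matrix can be sketched as follows. The country labels, the yearly partner lists and the function names are hypothetical; the sketch only assumes that for every country and year the largest export destination and largest import origin are known.

```python
from collections import Counter

def main_partner(yearly_top_partners):
    """Return the partner that was most often the largest one over the years."""
    return Counter(yearly_top_partners).most_common(1)[0][0]

def main_partner_matrix(countries, top_export, top_import):
    """Binary matrix with a one for each country's main export and import partner."""
    index = {name: k for k, name in enumerate(countries)}
    W = [[0] * len(countries) for _ in countries]
    for name in countries:
        for yearly in (top_export[name], top_import[name]):
            W[index[name]][index[main_partner(yearly)]] = 1
    return W

# Hypothetical data: largest export destination and import origin per year.
countries = ["A", "B", "C"]
top_export = {"A": ["B", "B", "C"], "B": ["A", "A", "A"], "C": ["A", "B", "A"]}
top_import = {"A": ["C", "C", "B"], "B": ["A", "C", "A"], "C": ["A", "A", "A"]}

W = main_partner_matrix(countries, top_export, top_import)
for row in W:
    print(row)
```

In this example country A has two different main partners, so its row contains two ones, whereas countries B and C share a single main partner for both imports and exports.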

For 89 countries in the sample the main import partner and the main export partner are the same country. The other 78 countries have a main import partner that differs from their main export partner. The main trading partners are far from evenly distributed across countries. Only 43 of the 167 countries are a main import or export partner of another country.

As figure 13 shows, the impact of these 43 countries is also very skewed. The great majority of them are the main import or export partner of only a few countries. In contrast, in 60% of the cases one of the top-five countries (USA, Germany, China, Russia and France) is the main import or export partner of a country.

Figure 13: Number of times main trading partner of other country in the period 2000-2009

Source: Data from the United Nations Statistical Division

In this paper the choice has been made to only assign a one to the W-matrix elements of the main import and main export partner. As with the p-order binary contiguity matrix, basing the W-matrix on higher order trading partners is also possible. However, the results discussed in section 6 show no improvement over the non-spatial model for the models based on the main import and main export partner W-matrix. Including a W-matrix based on higher order trading partners is therefore also unlikely to produce significant improvements over the non-spatial model and will not be considered in this paper.

The second non-geographical W-matrix is based on export flows of all countries in the base year 2000. Each element w_ij in the W-matrix shows which percentage of the total exports of country i is exported to country j. Due to unavailability of data, Botswana, Lesotho, Luxembourg, Namibia and Swaziland are excluded from the sample. Also, because not all countries in the world are included in the sample, not all rows sum to 100%. However, as all major countries are included in the sample only small percentages are excluded. Furthermore, due to row-normalization the impact on each unit by all other units is equalized.

Contrary to the W-matrix based on main trading partners, this W-matrix takes into consideration the relative economic activity between countries. However, as is the case with the W-matrix based on main trading partners, there are a few large countries whose impact on all countries is great, whereas the impact of all other countries is limited.

The third non-geographical W-matrix is based on existing regional trade blocs. Each element w_ij in the W-matrix is assigned a one if countries i and j are in the same trade bloc and a zero if not. Appendix B lists the thirteen different regional trade blocs considered. As trade blocs reduce or eliminate barriers to trade, economic activity between participating countries is likely to flourish. Furthermore, countries within a trade bloc often have much in common in the area of culture and institutions. Worldwide differences in these areas are also present on Internet. In contrast to the other two non-geographical matrices, this W-matrix does not suffer from the problem that the impact of a few large countries dominates.
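A minimal sketch of the trade bloc matrix is given below. The country labels and bloc memberships are hypothetical and do not correspond to the blocs listed in Appendix B; the zero diagonal is an assumption in line with the usual convention that a unit is not its own neighbor.

```python
def trade_bloc_matrix(countries, blocs):
    """Binary matrix with a one where two different countries share a trade bloc."""
    W = [[0] * len(countries) for _ in countries]
    for i, a in enumerate(countries):
        for j, b in enumerate(countries):
            shares_bloc = any(a in members and b in members for members in blocs)
            if i != j and shares_bloc:
                W[i][j] = 1
    return W

# Hypothetical memberships: B belongs to two blocs, D belongs to none.
countries = ["A", "B", "C", "D"]
blocs = [{"A", "B"}, {"B", "C"}]

W = trade_bloc_matrix(countries, blocs)
for row in W:
    print(row)
```

The resulting matrix is symmetric before row-normalization, and a country outside every bloc, like D here, has a row of zeros.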

The non-geographical W-matrices all satisfy the first of the necessary conditions discussed in section 2. For the first matrix, a country by definition has no more than one main export partner and no more than one main import partner. In the case of the second W-matrix, as percentages of the total export flows are taken, the row sum is never more than 100% and therefore finite. Finally, for the third W-matrix the first condition is satisfied as all trade blocs have only a limited number of members.

For the non-geographical W-matrices, 𝛿 is also expected to be positive. If economic activity between country j and country i is high, a higher level of Internet usage in country j is likely to result in a higher level of Internet usage in country i. If a certain country uses Internet for a great deal of its economic activities, a country having large economic activity with that country is required to use Internet for certain procedures. Furthermore, increasing the usage of Internet increases the efficiency of trading between the countries.

𝜃₁ is also expected to have a positive sign for this group of W-matrices. A higher level of income is naturally associated with a higher use of technology and therefore a higher level of Internet usage. Thus if there is a high degree of economic activity between country i and j, a high level of income in one of the countries is likely to increase the possibilities of Internet usage for the other country, thereby having a positive effect on Internet usage.

As noted earlier, a higher level of education attainment in a country increases the capabilities of the people in that country to use the Internet. This is likely to increase the demand for Internet based activities in the other countries, resulting in a positive sign for 𝜃₃.

Finally, similar to the model including the geographical W-matrix, 𝜃₄ is expected to have a negative sign. A decrease in restrictions on openness in a country is likely to result in an increase in Internet usage in the countries with which it has a high degree of economic activity.

4.3 Model selection

Besides the sign and significance of the spatial effects, it is also of interest to see which spatial model fits the data best. This depends on the chosen W-matrix as well as the type of interaction effects included. Theory on how to compare models including different W-matrices is addressed in the next section.

Turning to the type of interaction effects included, the distinction can be made between local and global effects. As shown above, local indirect effects occur if 𝜃𝑖 ≠ 0 and global indirect effects occur if 𝛿 ≠ 0. As discussed in section 2, global effects differ from local effects in the sense that effects fall on all units and not only those for which the elements of the W-matrix are non-zero.

Of the models considered in this paper the SEM model does not imply indirect effects. The SAR model does imply global indirect effects, but has the limitation that the ratio between the direct and indirect effect is the same for all explanatory variables. The SLX model and SDEM model are local spillover specifications, whereas the SDM model is a global spillover specification.
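The distinction can be made explicit with the matrix of partial derivatives that is standard in the spatial econometrics literature (for example LeSage and Pace, 2009). The expression below is a sketch in the symbols of section 4.2, where $x_k$ denotes the k-th explanatory variable and $I_N$ the identity matrix of order N; both are notational assumptions introduced here.

```latex
\frac{\partial E(\Delta IU)}{\partial x_k'} = (I_N - \delta W)^{-1} (\beta_k I_N + \theta_k W)
```

The diagonal elements of this matrix represent direct effects and the off-diagonal elements indirect effects. In the SAR model $\theta_k = 0$, so the matrix reduces to $\beta_k (I_N - \delta W)^{-1}$ and the ratio between indirect and direct effects does not depend on k. With $\delta = 0$, as in the SLX and SDEM models, it reduces to $\beta_k I_N + \theta_k W$, so that indirect effects only reach units for which the elements of W are non-zero.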

increase of usage in companies operating in totally different regions does not. Also, improvements in ICT technology are more likely to spill over only to nearby countries than globally.

These and other examples support local spillover effects for the model. However, a case can also be made for global effects. The use of Internet itself is not affected by distance. An increase in usage in one country can therefore have an impact on usage in all countries around the world, independent of distance.

Furthermore, the existence of global effects is supported by research on the diffusion of innovations. Rogers (1986), considered the originator of this field of research, identifies a critical mass of adopters and regular and frequent usage as necessary to ensure the success of the adoption of interactive communications innovations like Internet. Due to network effects, the possibilities and effectiveness of Internet usage increase with the number of users. A change in one of the variables in country i increases the possibilities of Internet usage in country j, which then increases the possibilities in country k and so on, impacting countries all around the world and setting a process in motion which can lead to a new steady-state equilibrium. This points to the inclusion of a spatial lag of the dependent variable in the model. Contrary to the SDEM model, the SDM model includes this spatial lag.

Moreover, the choice for global indirect effects is motivated by an econometric-theoretical argument provided by LeSage and Pace (2009). They show that the SDM model produces inefficient but unbiased parameters under the following three circumstances: (1) a potentially important variable is omitted from the model, (2) this variable is likely to be correlated with the explanatory variable and (3) the disturbance process is likely to be spatially dependent. This is in contrast to the SLX, SEM and SDEM models, which produce biased parameters. For the model in this paper, both a variable measuring the cost of Internet and a variable measuring income level differences can be regarded as potentially important variables omitted from the model.

5. METHODOLOGY

5.1 Estimation method

The base model (7) is estimated by OLS including fixed effects. For the SLX model this is also possible, as it includes only exogenous interaction effects. However, the other spatial models cannot be estimated by OLS. In this section the maximum likelihood (ML) estimator is derived for both the spatial model including endogenous interaction effects (the SAR model) and the spatial model including interaction effects among the error terms (the SEM model).

Estimation by OLS ignores the endogeneity of WY, which biases the estimates. To account for this endogeneity, the spatial models that include a spatial lag are estimated in this paper by ML, as proposed by Anselin (1988).

The SAR model can be specified as

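\[
\Delta IU_{i,t} = \delta \sum_{j=1}^{N} w_{i,j} \Delta IU_{j,t} + x_{i,t}\beta + \mu_i + \tau_t + \varepsilon_{i,t} \qquad (8)
\]

where $w_{i,j}$ is an element of the spatial weights matrix W and for convenience all explanatory variables are pooled into $x_{i,t}\beta$. The log-likelihood function can then be written as

\[
\mathrm{Log}L = -\frac{NT}{2} \log(2\pi\sigma^2) + T \log|I_N - \delta W| - \frac{1}{2\sigma^2} \sum_{i=1}^{N} \sum_{t=1}^{T} \Big( \Delta IU_{i,t} - \delta \sum_{j=1}^{N} w_{i,j} \Delta IU_{j,t} - x_{i,t}\beta - \mu_i - \tau_t \Big)^2 \qquad (9)
\]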

The term 𝑇𝑙𝑜𝑔|𝐼_𝑁− 𝛿𝑊 | represents the Jacobian term. This term is ignored in the OLS estimator and takes into account the endogeneity of WY.
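To give a feel for the Jacobian term, the sketch below evaluates T·log|I_N − δW| for a small row-normalized matrix. The three-country W-matrix, the value of δ and the function names are hypothetical; the determinant routine is a plain Gaussian elimination chosen to keep the example self-contained, not the routine used for the estimations in this paper.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # swapping two rows flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def jacobian_term(W, delta, T):
    """Return T * log|I_N - delta * W|; assumes the determinant is positive."""
    n = len(W)
    m = [[(1.0 if i == j else 0.0) - delta * W[i][j] for j in range(n)]
         for i in range(n)]
    return T * math.log(determinant(m))

# Hypothetical row-normalized contiguity matrix for three countries in a line.
W = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]

print(jacobian_term(W, 0.0, 9))  # no spatial dependence: the term vanishes
print(jacobian_term(W, 0.3, 9))  # positive delta: the term is negative
```

For this particular W-matrix the determinant equals 1 − δ², so the term is zero without spatial dependence and becomes more negative as δ moves away from zero.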

Anselin and Hudak (1992) have spelled out how a spatial model extended to include a spatial error term can be estimated by ML. In line with the specification of the SAR model, the SEM model can be specified as

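\[
\Delta IU_{i,t} = x_{i,t}\beta + \mu_i + \tau_t + u_{i,t}, \qquad u_{i,t} = \lambda \sum_{j=1}^{N} w_{i,j} u_{j,t} + \varepsilon_{i,t} \qquad (10)
\]

The log-likelihood function can then be written as

\[
\mathrm{Log}L = -\frac{NT}{2} \log(2\pi\sigma^2) + T \log|I_N - \lambda W| - \frac{1}{2\sigma^2} \sum_{i=1}^{N} \sum_{t=1}^{T} \Big( \Delta IU_{i,t} - \lambda \sum_{j=1}^{N} w_{i,j} \Delta IU_{j,t} - \big( x_{i,t} - \lambda \sum_{j=1}^{N} w_{i,j} x_{j,t} \big) \beta \Big)^2 \qquad (11)
\]

From (9) and (11) the log-likelihood functions of the SDM and SDEM model follow. The section now turns to the methodology used to compare the performance of the different models.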

5.2 Model testing

First the panel model including fixed effects is estimated. The model is estimated with country fixed effects only, with time-period fixed effects only and with both country and time-period fixed effects. A likelihood-ratio (LR) test is then performed to see whether the country fixed effects as well as the time-period fixed effects are jointly significant. An LR test compares the goodness of fit of two models by comparing how much more likely the data are under one model than under the other. The higher the value of the likelihood function, the better the set of parameter estimates fits the data set. The test statistic is compared with a Chi-squared distribution with degrees of freedom equal to the number of restrictions imposed.

Second the spatial models are estimated. Using the estimations of the SAR and the SEM model it is tested whether the spatial extensions are significant. To test if the extension of a spatial lag is significant, a t-test with the null hypothesis that 𝛿=0 is performed. The significance of the extension of a spatial error is tested by the null hypothesis that 𝜆=0. The SLX model is also considered. The significance of the spatial extension of the SLX model is tested by the null hypothesis that all 𝜃_𝑖=0, based on an F-test.

Next the models with more than one type of spatial interaction effect are considered. In order to study if including two instead of one interaction effect leads to a better fit, the SDM model can be tested against the SAR, SEM and SLX models. This is done by comparing the likelihood of the models in an LR test. The null hypothesis states that removing the spatial extension from the model does not substantially harm the fit of that model.

If the SDM model is the result of the test procedures, the performance of this model is compared with that of the SDEM model. As tests for significant differences between log-likelihood values require the models to be nested, they cannot be used when comparing the SDM and SDEM models. Therefore the Bayesian perspective on model comparison taken from LeSage (2014) is used, as this approach does not require nested models.
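The mechanics of the LR test can be sketched with the log-likelihood values reported in table 5 for the first-order binary contiguity W-matrix. The number of restrictions per comparison is an assumption made here by counting the parameters that are restricted (four for the SAR and SEM models, one for the SLX model); the 5% critical values of the Chi-squared distribution are standard table values. The function name is hypothetical.

```python
def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Likelihood-ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

# Log-likelihood values as reported in table 5.
loglik = {"SAR": -3559.39, "SEM": -3561.22, "SLX": -3591.25, "SDM": -3553.58}

# Assumed number of restrictions when moving from the SDM to each nested model.
restrictions = {"SAR": 4, "SEM": 4, "SLX": 1}

# 5% critical values of the Chi-squared distribution for 1 and 4 degrees of freedom.
critical_5pct = {1: 3.841, 4: 9.488}

for model, df in restrictions.items():
    lr = lr_statistic(loglik[model], loglik["SDM"])
    print(model, round(lr, 2), df, lr > critical_5pct[df])
```

A statistic above the critical value indicates that the restriction leading to the simpler model is rejected at the 5% level under the stated assumptions.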

additional test to see if more than one type of spatial interaction effect should be included in the model.

5.3 Comparing W matrices

Finally, models which use different spatial W-matrices are compared to see which matrix fits the data best. Again, tests for significant differences between log-likelihood function values cannot be used, as models with different W-matrices are non-nested. Therefore, following LeSage and Pace (2009), Bayesian posterior model probabilities are used to compare the models.

Before the estimation of the models each prior probability is set equal to 1/S, where S stands for the number of different models, so that a priori each model is equally likely. For each W-matrix the SDEM model, which the results show fits the data best, is then estimated using Bayesian methods, after which posterior probabilities based on the data and the estimation results are computed. The posterior probabilities sum to unity, and the model with the highest probability is said to fit the data best.
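The step from log-marginal likelihoods to posterior model probabilities can be sketched as follows. The log-marginal likelihood values and the function name are hypothetical and chosen only to illustrate the calculation with equal prior probabilities; they are not the values reported in this paper.

```python
import math

def posterior_model_probabilities(log_marginals, priors=None):
    """Posterior probabilities from log-marginal likelihoods and prior probabilities."""
    S = len(log_marginals)
    if priors is None:
        priors = [1.0 / S] * S  # a priori every model is equally likely
    # Subtract the largest value before exponentiating to avoid numerical underflow.
    top = max(log_marginals)
    weights = [p * math.exp(lm - top) for p, lm in zip(priors, log_marginals)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log-marginal likelihoods for three competing models.
log_marginals = [-3570.0, -3571.0, -3575.0]

probs = posterior_model_probabilities(log_marginals)
print(probs)
print(sum(probs))
```

Because the probabilities depend on the exponentiated differences, a gap of only a few points in the log-marginal likelihood already translates into a clear difference in posterior probability.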

6. RESULTS

This section discusses the results of the estimations. First the results of the non-spatial model are discussed. Thereafter the discussion turns to the different spatial models.

6.1 Results base model

Autocorrelation and heteroskedasticity in panel data models bias the standard errors and cause the results to be less efficient. To test for autocorrelation a test derived by Wooldridge (2002) is performed, which can be applied under general conditions. The test has a null hypothesis of no autocorrelation. The test yields an F-statistic of 3.903 with 1 and 166 degrees of freedom, which is just above the 5% critical value of 3.898. The null hypothesis of no autocorrelation is thus rejected at the 5% level.

To detect if the panel data suffers from heteroskedasticity the residuals of the model are plotted against the dependent variable, which is shown in figure 14. No clear sign of heteroskedasticity is detected in this figure.

Figure 14: Plot of residuals fixed effects models against dependent variable

Table 4: Results panel data models without spatial interaction effects

Model 1: country fixed effects. Model 2: time-period fixed effects. Model 3: country and time-period fixed effects. Model 4: country and time-period fixed effects and robust se.

| Determinants | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| D.Y | -0.0463 | -1.7568 | 1.4874 | 1.4874 |
| | (1.756) | (1.798) | (1.835) | (1.488) |
| D.TI | 0.0449*** | 0.0869*** | 0.0472*** | 0.0472*** |
| | (0.011) | (0.012) | (0.012) | (0.013) |
| D.E | 0.09314 | -0.8962 | -0.1685 | -0.1685 |
| | (0.831) | (0.953) | (0.828) | (0.148) |
| D.F | -0.0181 | -0.0202 | -0.0273 | -0.0273** |
| | (0.024) | (0.026) | (0.023) | (0.014) |
| N | 1503 | 1503 | 1503 | 1503 |
| R^2 | 0.126 | 0.048 | 0.104 | 0.104 |
| F | 4.41 | 6.21 | 3.65 | 3.65 |
| Log-likelihood | -3614.77 | -3918.53 | -3600.22 | -3600.22 |

Note: ***, **, * indicate significantly different from zero at the 0.01, 0.05, and 0.10 levels, respectively. Standard errors are in parentheses.

To address the issue of biased standard errors, robust standard errors are used to estimate the panel and spatial models. This affects the standard errors but leaves the parameter values unchanged. Model 4 in table 4 reports the estimation results for the panel model with both fixed effects and robust standard errors. The country and time-period fixed effects are not reported in the table as they are not of interest for this paper.

Whereas all previous papers find a major role for income in explaining Internet usage, this paper does not find a significant role for growth of income. One explanation is that previous papers used the income level as independent variable instead of income growth. Provided that the findings of the other papers are not based on spurious results, these results thus indicate that it is not so much growth of income but reaching a certain level of income which affects growth in Internet usage.

As expected, the measure of ICT infrastructure is highly significant and positive. A one percentage point increase in ICT infrastructure growth increases Internet usage growth by 0.05 percentage points. This may seem a marginal increase; however, in the light of the rapid expansion of the technological infrastructure the effect has been substantial. As noted in section 3, mobile cellular subscriptions per capita rose from 0.16 to 0.83 on average in the sample period.
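A back-of-the-envelope calculation illustrates the order of magnitude. It assumes, purely for illustration, that the coefficient of Model 4 in table 4 applies uniformly to the average rise in mobile cellular subscriptions from 16% to 83% of total population over the sample period.

```latex
0.0472 \times (83 - 16) = 0.0472 \times 67 \approx 3.16 \text{ percentage points}
```

Under this assumption the expansion of ICT infrastructure is associated with a cumulative increase in Internet usage of roughly three percentage points over the period.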

Despite strong theoretical support this paper does not find a significant effect for growth in education attainment. An explanation can be the low variation of the panel data. For most countries education attainment only increased marginally in the sample period, making it difficult to capture its effect on Internet usage.

For openness of a country, a decrease of one in the index increases Internet usage growth by 0.03 percentage points, an effect that is significant at the 5% level. As a decrease in the index represents more openness, this means that a higher degree of openness has a positive effect on Internet usage. Because on average the countries in the sample have become less open in the period 2000-2009, this decrease in openness has held back the growth of worldwide Internet usage. On average the increase of the index was however only 2.35, so the effect on worldwide Internet usage growth can be regarded as marginal.

is to capture the significance of spatial components in the model the value of the R-squared is however not of further interest.

6.2 Results spatial models

The focus of the paper now turns to the estimation results of the spatial models. First the results of each model containing a different W-spatial matrix are discussed. These results are compared with the results of the non-spatial model. Conclusions can then be drawn if the spatial extensions significantly improve the model, and if so which of the spatial extensions should be included. Also the direct and indirect effects are discussed. Thereafter the models with different W-spatial matrix are compared by Bayesian posterior model probabilities. From this conclusions can be drawn on which model fits the data best.

Table 5 reports the estimation results of the different spatial models calculated using the first-order binary contiguity W-matrix. For all models the coefficient estimates of growth in ICT infrastructure and openness are statistically significant and the signs are as expected. ICT infrastructure is still highly significant at the 1% level for all models. The coefficient measuring openness losses some of its significance when including spatial interaction effects, however this loss is not great. Growth in income and education attainment remain insignificant. For the SAR and SDM model 𝛿 is positive and significant, whereas for the SEM and SDEM model’s 𝜆 this also applies. The table furthermore shows a clear improvement in the log-likelihood values for the spatial models, indicating an improvement of the model.

As the null hypotheses 𝛿=0 and 𝜆=0 are rejected at the 1% level, extending the non-spatial model to the SAR and to the SEM model significantly improves the model. For the SLX model, the F-statistic of 3.60 rejects the null hypothesis that all 𝜃_𝑖=0 at the 1% level, indicating that this extension also significantly improves the model.
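As an illustrative cross-check, the log-likelihood values in Table 5 can be used to compute likelihood ratio (LR) statistics for the extension of the non-spatial model to the SAR and SEM model. The sketch below assumes that the models are nested and estimated on the same sample, so that each extension adds one parameter and the LR statistic follows a chi-square distribution with one degree of freedom. The function name is hypothetical and this is not necessarily the test procedure behind the results reported above.

```python
import math

# Log-likelihood values taken from Table 5.
loglik = {"Panel": -3600.22, "SAR": -3559.39, "SEM": -3561.22}

def lr_test_1df(ll_restricted, ll_unrestricted):
    """Return the LR statistic and its p-value under chi-square with 1 df."""
    lr = 2.0 * (ll_unrestricted - ll_restricted)
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value

lr_sar, p_sar = lr_test_1df(loglik["Panel"], loglik["SAR"])
lr_sem, p_sem = lr_test_1df(loglik["Panel"], loglik["SEM"])
print(f"Panel vs SAR: LR = {lr_sar:.2f}, p = {p_sar:.2e}")
print(f"Panel vs SEM: LR = {lr_sem:.2f}, p = {p_sem:.2e}")
```

Both statistics lie far above 6.63, the 1% critical value of the chi-square distribution with one degree of freedom, which agrees with the rejection of 𝛿=0 and 𝜆=0.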


Finally, the SDM model is compared to the SDEM model to assess whether the indirect effects are global or local in nature. The log-likelihood values in table 5 show little difference. As discussed, the comparison is made by calculating Bayesian posterior model probabilities of the SLX, SDM and SDEM model. The results are reported in table 6, which gives the log-marginal likelihood values as well as the posterior model probabilities of the three models.
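The step from log-marginal likelihoods to posterior model probabilities can be sketched as follows. The example assumes equal prior probabilities for the three models, and the input values are hypothetical placeholders, not the figures from table 6; the function name is also illustrative.

```python
import math

def posterior_model_probabilities(log_marginals):
    """Convert log-marginal likelihoods to posterior probabilities (equal priors)."""
    top = max(log_marginals.values())
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    weights = {m: math.exp(v - top) for m, v in log_marginals.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}

# Hypothetical values for illustration only; these are NOT the table 6 figures.
example = {"SLX": -3600.0, "SDM": -3570.0, "SDEM": -3569.0}
probs = posterior_model_probabilities(example)
for model, p in probs.items():
    print(f"{model}: {p:.4f}")
```

Because the probabilities depend on exponentiated differences, a gap of only one unit in the log-marginal likelihood already makes one model roughly 2.7 times as probable as the other. Small differences in log-marginal likelihood values can therefore go together with clearly different model probabilities.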

The log-marginal likelihood values differ little; however, the SDEM model clearly has a higher probability than the SDM model. Moreover, as indicated by the LR-test earlier, both models are preferred over the SLX model. The model comparison thus shows that the SDEM model, which includes exogenous interaction effects and interaction effects among the error terms, fits the data best. These results therefore do not support the theory on the diffusion of interactive communications addressed in section 4. Instead they support the idea of local spillovers affecting Internet usage.

Table 5: Model comparison of estimation results explaining Internet usage growth with first-order binary contiguity W-matrix

Determinants     Panel        SAR          SEM          SLX          SDM          SDEM
D.Y              1.4874       1.1388       1.6770       1.5268       1.6036       1.4307
                 (1.488)      (1.332)      (1.358)      (1.466)      (1.322)      (1.349)
D.TI             0.0472***    0.0395***    0.0344***    0.0393***    0.0341***    0.0391***
                 (0.013)      (0.013)      (0.013)      (0.013)      (0.013)      (0.013)
D.E              -0.1685      -0.1229      -0.1020      -0.1873      -0.1387      -0.1575
                 (0.148)      (0.128)      (0.138)      (0.145)      (0.128)      (0.145)
D.F              -0.0273**    -0.0257*     -0.0270*     -0.0276*     -0.0267*     -0.0269*
                 (0.014)      (0.014)      (0.014)      (0.014)      (0.015)      (0.015)
WD.IU                         0.2851***                              0.2738***
                              (0.064)                                (0.064)
WD.Y                                                    -2.5811      -3.7332      -3.6972
                                                        (3.341)      (3.029)      (3.297)
WD.TI                                                   0.0806***    0.0618***    0.0726***
                                                        (0.026)      (0.023)      (0.025)
WD.E                                                    -0.9675      -0.6546      -0.5685
                                                        (0.634)      (0.627)      (0.663)
WD.F                                                    -0.0013      -0.0084      -0.0026
                                                        (0.028)      (0.026)      (0.028)
Wu                                         0.2815***                              0.2721***
                                           (0.065)                                (0.064)
R²               0.104        0.137        0.088        0.207        0.228        0.212
Log-likelihood   -3600.22     -3559.39     -3561.22     -3591.25     -3553.58     -3554.60