Trading flow of Mainland China and Hong Kong: Empirical Evidence from the Gravity Model


Master Thesis of International Economics and Business

Faculty of Economics

Rijksuniversiteit Groningen

Landleven 5, Groningen

9747 AD, the Netherlands

Tel: 0031 50 3637098

Fax: 0031 50 3633788

By Wenmin Peng (1349732) July 2006

Instructor: Dr. H. W. A. Dietzenbacher Supervised by Prof Dr A. van Witteloostuijn

Faculty of Economics


ACKNOWLEDGEMENT


Abstract: In this paper, we investigate trade flows between Hong Kong and mainland China using the gravity model. With empirical data covering 24 provinces or provincial-level regions of mainland China from 1990 to 2004, we find that economic size, population and distance are significant in explaining the volume of trade between Hong Kong and mainland China. In addition, the nine CEPA provinces are found to trade more with Hong Kong owing to their proximity to it.


CONTENT

1. Introduction
2. Literature Review
2.1 The Gravity Model
2.2 Empirical Studies
3. Methodology
3.1 Basic Regression Model
3.2 Data and Measurement
3.3 Econometric Specification
3.3.1 FE, RE and Pooled Regression
3.3.2 Heteroskedasticity
3.3.3 Autocorrelation
3.3.4 Collinearity
4. Results and Discussion
5. Conclusion
6. Appendices


1. INTRODUCTION

Hong Kong has long been the biggest trade partner of mainland China and has established trade relationships with all provinces and municipalities directly under the central government. Although Hong Kong became part of China again in 1997 as a Special Administrative Region under the framework of ‘one country, two systems’, it remains a separate region with its own customs system and manages its relationships with mainland provinces on its own. In addition, the economies of mainland provinces differ in size, level of development and speed of growth, and their distance from Hong Kong ranges from 161 to 4352 kilometers. As a result, trade volume with Hong Kong varies from province to province. Hence, the gravity model can be applied to investigate trade flows between Hong Kong and mainland China.

Mainland China and Hong Kong are expected to develop even closer economic ties under the Closer Economic Partnership Arrangement (CEPA). Over the past two decades, inter-regional trade has expanded rapidly and mainland China has become an increasingly important trade partner of Hong Kong. CEPA, a free trade pact within China, was officially launched at the beginning of 2004 and covers trade in goods, services and investment, with the aim of facilitating Hong Kong investment in the mainland. In addition, many economists think that CEPA will facilitate the integration and development of resources in the "Greater Pearl River Delta" (Guangdong, Hong Kong and Macao), provide an important platform for accelerating regional economic integration, and act as a powerful engine for the creation of the "9+2 Pan-Pearl River Delta Economic Cooperation Zone" (that is, Guangdong and its eight neighboring provinces along the upper stream of the Pearl River, Fujian, Hainan, Jiangxi, Hunan, Guangxi, Guizhou, Yunnan and Sichuan, plus the two Special Administrative Regions, Hong Kong and Macao).

The gravity model came into being in the 1960s and has since become a frequently used tool for analyzing trade flows between countries and regions; it is still widely used today. The main purpose of this paper is to analyze trade flows between Hong Kong and each mainland province with the gravity model. We conduct an empirical study at the aggregate level to examine the factors that have affected trade between each mainland province and Hong Kong since the 1990s. The research questions are as follows: Is distance really a significant factor affecting trade flows between mainland provinces and Hong Kong? What role has Guangdong province played in trade with Hong Kong? And what role have the CEPA members played? We investigate the impact of these factors on trade flows between mainland provinces and Hong Kong regardless of direction.

In the following section, the gravity model is introduced as a tool for analyzing the factors that influence inter-regional trade between mainland China and Hong Kong, and earlier empirical work on trade flows is reviewed. Section 3 builds up the model, states the hypotheses and explains the variables. In the fourth section, regression results are presented and discussed. Finally, conclusions are drawn.

2. LITERATURE REVIEW

2.1 The Gravity Model

Tinbergen (1962) and Pöyhönen (1963) were the first scholars to introduce the gravity model into research on international trade. The gravity model was not derived from theory but from empirical observation: bilateral trade volume increases with the economic size of the two parties involved and decreases with the distance between them, which resembles the law of gravity in physics. Although the model lacks theoretical support and has received much criticism, it has been applied successfully in empirical studies over the last 40 years and explains observed international trade volumes well. According to the model, exports from country i to country j can be explained by economic size (GDP or GNP), population and the geographic distance between the two sides. The basic formula is as follows,

(1)

where Xij denotes the nominal value of trade from country i to country j, Yi and Yj the nominal economic size of countries i and j respectively, Pj the population of country j, Dij the distance between the two countries, eij the residual, and A a term that incorporates all other technical factors.

Besides, many other factors can affect trade volume between two countries, such as policy, history and culture. The basic model above can be expanded to incorporate these factors in the form of dummy variables. For example, if two countries share a language, a border or a colonial history, belong to the same political or economic bloc, or have signed a treaty that encourages or restricts trade between them, the dummy describing this takes the value 1, and 0 otherwise. In applications of the gravity model, economic size per capita is often included instead of absolute size, since size per capita reflects not only population but also the structure of demand and comparative advantage. Transforming Equation 1, we obtain the following formula,

(2)

where P incorporates several dummy variables describing policy, historical and cultural factors. A higher Yi reflects a bigger economy with a stronger capability of producing goods for export. Hence, we expect a positive β1. We also expect a positive β2, since the higher the national income of the destination country, the higher its demand for imported goods. National income per capita also represents the level of economic development and thus the capacity to export, and a higher national income per capita leads to higher demand for both the volume and the variety of imports. Many studies include income per capita (Huang, D & Huang, Y, 2006), measured as the ratio of GDP to population, alongside GDP itself, because according to the Linder hypothesis countries with similar levels of income per capita exhibit similar tastes, produce similar but differentiated products and trade more amongst themselves (Roberts, 2004). However, we do not add GDP per capita, as this can create collinearity with absolute economic size, Yi and Yj. Instead, we let population and economic size coexist in the same equation, following Polak (1996) and Martínez-Zarzoso & Nowak-Lehmann (2003), and expect both β3 and β4 to be negative, since, holding everything else constant, a larger population lowers income per capita and thus reduces individuals’ demand for imports through a lower disposable income. We also include the distance between the two countries as a measure of trade cost, which limits trade activity. Hence, a negative β5 is expected. In addition, dummies that control for trade incentives should have positive coefficients, while dummies for trade restrictions should have negative coefficients.
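The link between a multiplicative gravity formula and the log-linear form used for estimation can be illustrated with a minimal sketch. It assumes a multiplicative form in which every variable enters with its own exponent; the function names and all numerical values are hypothetical and chosen only to show the expected signs (positive for economic size, negative for population and distance).

```python
import math

def gravity_trade(a, y_i, y_j, p_i, p_j, d_ij, betas):
    """Assumed multiplicative form: X = A * Yi^b1 * Yj^b2 * Pi^b3 * Pj^b4 * Dij^b5."""
    b1, b2, b3, b4, b5 = betas
    return a * y_i**b1 * y_j**b2 * p_i**b3 * p_j**b4 * d_ij**b5

def log_gravity_trade(a, y_i, y_j, p_i, p_j, d_ij, betas):
    """Same relationship after taking natural logs of both sides."""
    b1, b2, b3, b4, b5 = betas
    return (math.log(a) + b1 * math.log(y_i) + b2 * math.log(y_j)
            + b3 * math.log(p_i) + b4 * math.log(p_j) + b5 * math.log(d_ij))

# Hypothetical exponents and inputs, not estimates from this paper.
betas = (1.0, 1.0, -0.5, -0.5, -1.0)
near = gravity_trade(2.0, 500.0, 300.0, 80.0, 7.0, 200.0, betas)
far = gravity_trade(2.0, 500.0, 300.0, 80.0, 7.0, 2000.0, betas)
log_near = log_gravity_trade(2.0, 500.0, 300.0, 80.0, 7.0, 200.0, betas)
```

With a distance exponent of -1, a tenfold increase in distance divides predicted trade by ten, and the log of the multiplicative prediction coincides with the log-linear prediction, which is why the coefficients can be read as elasticities.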

2.2 Empirical Studies

Many studies use the gravity model to analyze trade flows between China and the outside world, and most of them focus on international trade. Roberts (2004) studies the impact of the China-ASEAN Free Trade Area (CAFTA) and concludes that the gravity model fits trade flows within CAFTA well. He also finds that the CAFTA economies would have to map out policies and strategies to bring about convergence in their income levels if maximum benefits are to be obtained from the proposed free trade area. With a positive dummy variable attached to intra-APEC trade, Frankel, Stein and Wei (1994) claim that APEC is a natural regional trade zone, since there was no deliberately designed policy to promote trade among APEC economies. This study has implications for our research in that, similar to APEC, the CEPA provinces in China had not received any practical policy support to facilitate trade until 2004. Filippini and Molini (2003) extend the gravity model with a few additional variables and find that the wider the technological gap, the less countries are encouraged to trade, while the signs of the standard variables, such as economic size and population, are consistent with the hypotheses. They also show the very high propensity of newly industrialized East Asian countries (including China and excluding Japan) to exchange manufactured products with the EU, Japan and the USA from the late 1970s, while the propensity of Latin American countries declined. Among the East Asian economies, China, not surprisingly, has played a very important role as both exporter and importer in more recent years. However, as far as we know, there is no study of intra-national trade flows in China.


capita, and amount of bilateral trade between the two countries.

3. METHODOLOGY

3.1 Basic Regression Model

This paper uses the expanded gravity model (Equation 2) to test trade volume between Hong Kong and mainland provinces. Given the research questions of this paper, we need two dummy variables, one for language and one for CEPA. Hong Kong, located in the Pearl River Delta, shares the local language, Cantonese, with Guangdong province. From the 1980s, most manufacturers started to move into other parts of the delta. Hence, there is an exceptionally strong relationship between Hong Kong and Guangdong province. In addition, Guangdong has long had ties with the eight surrounding provinces in terms of transportation, exchange of materials, exploitation of resources, scientific research, tourism and so on (Li, 2003). With Hong Kong and Guangdong playing the central role, the eight provinces have been trying to join a closer economic network, called the ‘Greater Pearl River Delta’. Taking this into consideration, we develop Equation 2 into the following,

(3)

where TFiht is the trade volume between province i and Hong Kong in year t. Yit and Yht are the nominal GDP of province i and Hong Kong respectively. Pit and Pht are the populations of province i and Hong Kong respectively. Dih measures the shortest highway distance from the capital city of province i to Hong Kong. GD is a dummy taking the value 1 for Guangdong and 0 for other provinces, and CEPA is also a dummy, taking the value 1 for the eight provinces in the ‘Greater Pearl River Delta’ and 0 for other provinces.

Based on the above model and previous literature review, we make following hypotheses,

H1: Trade flow between Hong Kong and mainland provinces is positively correlated with the GDP of mainland provinces (β1>0).

H2: Trade flow between Hong Kong and mainland provinces is positively correlated with the GDP of Hong Kong (β2>0).

H3: Trade flow between Hong Kong and mainland provinces is negatively correlated with the population of mainland provinces (β3<0).

H4: Trade flow between Hong Kong and mainland provinces is negatively correlated with the population of Hong Kong (β4<0).

H5: Trade flow between Hong Kong and mainland provinces is negatively correlated with the distance between mainland provinces and Hong Kong (β5<0).

H6: Trade flow between Hong Kong and Guangdong province is in general higher than that between Hong Kong and other provinces (β6>0).

H7: Trade flow between Hong Kong and CEPA provinces other than Guangdong is in general higher than that between Hong Kong and other provinces (β7>0).
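The coefficient labels in H1 to H7, together with the variable definitions in Table 1 and the logged regressors reported in Table 9, are consistent with a log-linear specification of the following form. This is a sketch assembled from those definitions; the intercept symbol and the notation for the error term are assumptions.

```latex
\ln TF_{iht} = \beta_0 + \beta_1 \ln Y_{it} + \beta_2 \ln Y_{ht}
             + \beta_3 \ln P_{it} + \beta_4 \ln P_{ht} + \beta_5 \ln D_{ih}
             + \beta_6\, GD_i + \beta_7\, CEPA_i + \varepsilon_{it}
```

Under this form, β1 to β5 are elasticities of trade flow with respect to the corresponding variable, while β6 and β7 shift the intercept for Guangdong and for the other eight CEPA provinces respectively.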


3.2 Data and Measurement

All data in this study come from yearbooks compiled by the central government or provincial governments. The values of TFiht are taken from the 1998 to 2004 yearbooks of 30 mainland provinces and from the Yearbook of Foreign Economy and Trade (1998-2003). Our data cover 24 mainland provinces over 15 years (1990-2004), but with only 276 instead of 360 observations, since data are missing for some provinces. Yit and Pit are from the dataset of the China Economic Information Network (www.cei.gov.cn), while Yht and Pht are from the Census and Statistics Department of the Hong Kong government (www.info.gov.hk/censtatd). Dih, which measures the distance between Hong Kong and mainland provinces, is from Traveling Overall China (www.go2map.com). Because data for some provinces are not available in these yearbooks or in other sources, we have to estimate the regressions on an unbalanced dataset. Jilin, Hubei, Guizhou, Tibet, Henan, Inner Mongolia and Qinghai are excluded from our sample due to insufficient data. The measurements of the variables in the regression model are described in Table 1.

3.3 Econometric Specification

For most gravity studies of trade flows, ordinary least squares is applied to cross-sectional data. However, some scholars have recently started to use panel analysis (Lin and Xia, 2004), which benefits from an increased number of observations and allows variation across cross-sections to be investigated. Hence, this paper attempts to conduct a panel study. There are three main models for panel analysis: the pooled regression model, the fixed effects (FE) model and the random effects (RE) model. Variants of these models may be needed to deal with heteroskedasticity and autocorrelation, since panel data involve both time and cross-sectional dimensions.

TABLE 1
Variables and Measurements

Dependent variable
TFiht   The sum of trade flows between province i and Hong Kong in both directions in year t (in million US dollars)

Independent variables
Yit     Nominal GDP of province i in year t (in million US dollars)
Yht     Nominal GDP of Hong Kong in year t (in million US dollars)
Pit     Population of province i in year t (in thousands)
Pht     Population of Hong Kong in year t (in thousands)
Dih     The shortest highway distance from the capital city of province i to Hong Kong (in km)
GD      =1 for Guangdong and =0 otherwise
CEPA    =1 for members of CEPA except Guangdong and =0 otherwise

Note: The GDP of mainland provinces and of Hong Kong is measured in RMB and Hong Kong dollars respectively in the data sources. We use the exchange rate at the end of each sample year, taken from the website of the IMF, to convert both into US dollars.

The gross product of a province is recorded on the websites of the central and local governments of China as provincial GDP. Hence, we use the term GDP instead of gross regional product (GRP).

3.3.1 FE, RE, and pooled regression

Before any test, we assume the FE model to be the best specification for our study, for two reasons. Firstly, the pooled regression model is not suitable, because the 24 provinces have different characteristics that are not captured by the explanatory variables and can affect our regression differently. Secondly, the RE model is not the best candidate either, since it views the cross-sections on which we have data as a random sample from a larger population. However, the 24 provinces are not a random sample drawn from a population but a nearly exhaustive set of mutually exclusive cross-sections. Hence, the dummy variables of the provinces are treated as unknown but fixed parameters, and we make inferences only about the provinces in our sample. In other words, there is no larger population of provinces to which the province dummies in this research could be extended. The three statistical models differ mainly in their assumptions concerning the intercept and the error term. It is assumed that the error term in an equation can be decomposed into two independent elements:

$$\varepsilon_{it} = u_i + v_{it} \qquad (4)$$

where ui is time-invariant and accounts for any unobservable region-specific effects not captured by the explanatory variables. The term vit represents the remaining disturbance and varies over cross-sections and time. In a pooled regression model, the ui’s are assumed to take the same value for all cross-sections. Both the FE and the RE models accommodate unobservable heterogeneity. In the former, the ui’s are fixed parameters to be estimated with dummies for time or cross-sections, while in an RE model the ui’s are assumed to be random, independently and identically distributed, i.e. ui ~ N(0, σ_u^2). The RE model is better than the FE model if the unobservable effects are uncorrelated with the regressors, because the dummies in an FE model can exhaust degrees of freedom. To decide which model is more appropriate for this study, we follow Liu et al. (2000) and conduct three diagnostic tests to check the validity of the assumptions accompanying the models: the likelihood ratio (LR) test for the FE model against the pooled regression model, the Hausman specification (HS) test for the FE model against the RE model, and the Lagrange Multiplier (LM) test for the RE model against the pooled regression model. Rejection of the null hypotheses in the first two tests favours the FE model (see Appendix B for a more detailed description of these two tests). Besides, some events during the period from 1990 to 2004, such as the return of Hong Kong, the accession to the WTO and the Asian financial crisis, may have had a significant impact on FDI and thus on our results. For this reason, we also need to decide on a specification that controls for period effects, in addition to selecting the best model specification for cross-sections.

However, because we include distance (Dih) and the GDP of Hong Kong (Yht), we are not able to apply FE dummies in our regression: distance is constant over time and the GDP of Hong Kong is constant across cross-sections. Cross-sectional FE dummies are therefore perfectly collinear with distance, and period FE dummies with the GDP of Hong Kong, which leads to a singular matrix. In addition, the Hausman test statistics in Table 2 do not suggest that the FE model is better than the RE model at the 5% significance level with respect to either periods or cross-sections.

Hence we can only choose between the RE and the pooled regression model. This calls for the LM test for the RE model against the pooled regression model derived by Breusch and Pagan (1980). However, we are not able to carry out this test due to the unbalanced nature of our dataset, which produces an unbalanced table of residuals. With missing values in the residual table, the two vectors, v’ and v, cannot be multiplied with the Kronecker product matrix as described in Appendix C. For the same reason, we are not able to combine cross-sectional RE and period RE. Hence, we run three regressions for Equation 3, RE along periods, RE along cross-sections and pooled regression, and see whether they produce similar estimates.
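Why FE dummies cannot coexist with distance or with the GDP of Hong Kong can be illustrated with the within transformation, which is equivalent to including a dummy per group. The sketch below uses a hypothetical toy panel of two provinces and three years; the helper name and all numbers are illustrative assumptions, not data from this study.

```python
def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the FE 'within' transformation)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

# Hypothetical toy panel: two provinces observed in three years each.
province = ["A", "A", "A", "B", "B", "B"]
year = [1990, 1991, 1992, 1990, 1991, 1992]
distance = [161.0, 161.0, 161.0, 2000.0, 2000.0, 2000.0]  # constant over time
hk_gdp = [70.0, 80.0, 90.0, 70.0, 80.0, 90.0]             # constant across provinces
prov_gdp = [10.0, 12.0, 15.0, 4.0, 5.0, 7.0]              # varies in both dimensions

dist_within_province = demean_by_group(distance, province)
hk_gdp_within_year = demean_by_group(hk_gdp, year)
gdp_within_province = demean_by_group(prov_gdp, province)
```

After demeaning, distance within provinces and the GDP of Hong Kong within years are columns of zeros, so their coefficients cannot be identified, whereas provincial GDP retains variation.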

TABLE 2
Hausman Test for Cross-sections and Periods

Test summary     Chi-Sq. statistic   d.f.   Prob.
Cross-section    3.188               4      0.527
Period           3.770               5      0.583

3.3.2 Heteroskedasticity

Since the dataset is organized in the form of cross-sections at each time point (year in this case), our regression can easily suffer from heteroskedasticity. Cross-sectional data invariably involve observations on economic units of different sizes, for instance provinces with labour forces of different sizes. Frequently, the larger a cross-section is, the more difficult it is to explain the variation in an outcome variable by the variation in a set of explanatory variables, because larger cross-sections, such as firms and households, are likely to be more diverse and flexible in the way in which values of the dependent variable are determined (Hill, Griffiths, and Judge 2001). Hence, many studies with cross-sectional or panel data try to curb heteroskedasticity, and weighted least squares (WLS) is the specification most frequently applied. The weights are the share of each plant in total annual industry output in Aitken and Harrison (1999), and the square root of the number of firms in different groups of manufacturing enterprises in Liu (2002). Similarly, we need to assign weights to provinces to trim heterogeneity of size. Thus, we implement the Lagrange Multiplier (LM) test for heteroskedasticity (unimelb.edu.au, 2006), look for the cause of the problem and carry out corrective measures if heteroskedasticity is detected. Tables 3, 4 and 5 show, surprisingly, that there is no heteroskedasticity caused by our explanatory variables.

TABLE 3
LM Heteroskedasticity Test Results for Pooled Regression

Regressors in Step 2       LM test statistic   d.f.   p-value
Four variables together    .0351               4      .998
lnYit                      .000                1      .999
lnYht                      .000                1      .999
lnPit                      .000                1      .999
lnPht                      .000                1      .999

However, we can only conclude that there is no sign of heteroskedasticity caused by our explanatory variables. It remains an open question whether any other measure of size gives rise to heteroskedasticity. Hence, we use feasible GLS (FGLS) along cross-sections for the pooled regression model, and OLS with White consistent covariance estimators for the two RE models, along time or cross-sections, since estimated GLS is already used in RE regression.

TABLE 4
LM Heteroskedasticity Test Results for RE along cross-sections

Regressors in Step 2       LM test statistic   d.f.   p-value
Four variables together    .213                4      .9
lnYit                      .000                1      .999
lnYht                      .000                1      .999
lnPit                      .000                1      .999
lnPht                      .000                1      .999

TABLE 5
LM Heteroskedasticity Test Results for RE along periods

Regressors in Step 2       LM test statistic   d.f.   p-value
Four variables together    .035                4      .98
lnYit                      .000                1      .999
lnYht                      .000                1      .999
lnPit                      .000                1      .999
lnPht                      .000                1      .999
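A common two-step form of the LM test for heteroskedasticity can be sketched as follows. The exact steps used in this study are not reproduced in the text, so the sketch assumes the textbook version: estimate the model by OLS, regress the squared residuals on the candidate regressor, and compute LM = N x R-squared, which is compared with a chi-square critical value whose degrees of freedom equal the number of Step 2 regressors. The data are hypothetical and restricted to one regressor to keep the example short.

```python
def simple_ols(x, y):
    """OLS of y on a constant and one regressor; returns (residuals, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sst = sum((yi - my) ** 2 for yi in y)
    r2 = 1.0 - sum(e * e for e in resid) / sst
    return resid, r2

def lm_heteroskedasticity(x, y):
    """Step 1: OLS residuals. Step 2: regress squared residuals on x. LM = N * R^2."""
    resid, _ = simple_ols(x, y)
    squared = [e * e for e in resid]
    _, r2_aux = simple_ols(x, squared)
    return len(x) * r2_aux

# Hypothetical data: errors alternate in sign; in the second series their size grows with x.
x = [float(t) for t in range(1, 21)]
sign = [1.0 if t % 2 == 0 else -1.0 for t in range(1, 21)]
y_const_var = [1.0 + 2.0 * xi + s for xi, s in zip(x, sign)]
y_growing_var = [1.0 + 2.0 * xi + 0.5 * xi * s for xi, s in zip(x, sign)]

lm_const = lm_heteroskedasticity(x, y_const_var)
lm_growing = lm_heteroskedasticity(x, y_growing_var)
CHI2_1_CRIT_5PCT = 3.841  # 5% critical value of chi-square with 1 d.f.
```

In this toy example the statistic stays below the critical value when the error variance is constant and exceeds it when the error variance grows with the regressor, which is the pattern the test is designed to detect.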

3.3.3 Autocorrelation

Since we have 24 provinces over a period of 15 years, the variables on the right-hand side of the equation may affect trade flow for more than one period. Hence, it is possible that the errors are not independent along the time dimension, cov(ε_t, ε_s) ≠ 0, which violates the OLS assumption that errors are independent and distributed N(0, σ_ε^2) for all individuals and in all time periods (Hill, Griffiths & Judge, 2001). To test for the presence of autocorrelation we apply the Lagrange Multiplier (LM) test. The null hypothesis of the LM test is ρ=0 in Equation 5,

$$\varepsilon_{it} = \rho\,\varepsilon_{i,t-1} + v_{it} \qquad (5)$$

where ρ is a parameter that determines the correlation properties of the error term εit in cross-section i, and the vit are uncorrelated random variables with a constant variance. If H0 is rejected, autocorrelation poses a threat to the validity of our estimation. First, we run Equation 3 in three models: pooled regression, RE along periods and RE along cross-sections. Second, we regress the dependent variable on the explanatory variables and the one-year lagged error term obtained from the first step. We find that the one-year lagged error term is significant in all three specifications (its t-values are 22.400, 10.074 and 20.492, all significant at 1%). The Durbin-Watson (DW) and LM test statistics are presented in Table 6 for the three model specifications.

TABLE 6
Test for Autocorrelation

Model specification        DW      LM
Pooled regression          .903    22.400**
RE along cross-sections    .893    10.074**
RE along periods           .597    20.492**

Note: * and ** denote significance at 5% and 1% respectively.

TABLE 7
Results of Augmented Dickey-Fuller Test

Variable          Test statistic
Ln(Xiht_Anhui)    -1.745
Ln(Xhit_Anhui)    -2.084
Ln(Yit_Anhui)     -0.402
Ln(Yht)           -5.300**
Ln(Pit_Anhui)     -2.494
Ln(Pht)           -1.352
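The two summary measures behind this diagnosis can be sketched for a single residual series. The sketch computes the Durbin-Watson statistic and a least-squares estimate of ρ from a regression of each residual on its one-period lag; it is a simplification of the LM procedure described above, which also keeps the explanatory variables in the second regression. The residual series are hypothetical.

```python
def durbin_watson(resid):
    """DW = sum of squared first differences of residuals / sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

def ar1_coefficient(resid):
    """Least-squares slope of e_t on e_{t-1} without intercept (rho in an AR(1) error model)."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
    return num / den

# Hypothetical residual series for one cross-section.
persistent = [1.0, 0.9, 0.8, 0.6, 0.5, 0.3, 0.1, -0.1, -0.3, -0.4]
alternating = [1.0, -1.0, 1.0, -1.0]

dw_persistent = durbin_watson(persistent)
rho_persistent = ar1_coefficient(persistent)
dw_alternating = durbin_watson(alternating)
rho_alternating = ar1_coefficient(alternating)
```

A DW value well below 2, as in Table 6, goes together with a positive estimate of ρ, while a DW value above 2 indicates negative autocorrelation.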

Hence, we conclude that autocorrelation does threaten the least squares estimates. The reason could be that all variables that vary over time are non-stationary, which in turn produces a period-dependent error term. We use the augmented Dickey-Fuller test to investigate the natural logs of Xiht, Xhit, Yit, Yht, Pit and Pht, since they all vary along the time dimension. Equation 6 illustrates the test process.

(6)

The null hypothesis is γ=0, which means that the variable y has a unit root and thus is non-stationary. Our results suggest that the variables for all provinces and for Hong Kong are non-stationary, with the exception of Hong Kong’s GDP. An example for each variable is given in Table 7.
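The logic of the unit root test can be illustrated with the simplest Dickey-Fuller regression, in which the first difference of a series is regressed on a constant and its lagged level. This is a sketch under stated assumptions: it omits the lagged differences that make the test "augmented" and any trend term, the function name is hypothetical, and the series is artificial. The resulting t-statistic has to be compared with Dickey-Fuller critical values rather than with the usual t distribution.

```python
import math

def dickey_fuller(y):
    """Regress dy_t on a constant and y_{t-1}; return (gamma_hat, t_statistic)."""
    lagged = y[:-1]
    diff = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(diff)
    mx, my = sum(lagged) / n, sum(diff) / n
    sxx = sum((x - mx) ** 2 for x in lagged)
    gamma = sum((x - mx) * (d - my) for x, d in zip(lagged, diff)) / sxx
    alpha = my - gamma * mx
    rss = sum((d - alpha - gamma * x) ** 2 for x, d in zip(lagged, diff))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of gamma_hat
    return gamma, gamma / se

# Hypothetical mean-reverting series: y_t = 0.5 * y_{t-1} + u_t with a fixed shock pattern.
shocks = [1.0, -1.0, -1.0, 1.0] * 6
series = [0.0]
for u in shocks:
    series.append(0.5 * series[-1] + u)

gamma_hat, t_stat = dickey_fuller(series)
```

For a mean-reverting series the estimate of γ is clearly negative; for a series with a unit root it would be close to zero, so that the null hypothesis γ=0 could not be rejected.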

Since the main reason for autocorrelation is the presence of autocorrelated, non-stationary variables, we check the possibility of using an error correction model, which in turn requires that the dependent and explanatory variables be cointegrated. If γ in the following equation is significant, then there is cointegration in our model.

(7)

The t-values of γ for pooled regression, RE along cross-sections and RE along periods are -5.885, -5.719 and -5.890 respectively, all significant at the 1% level, indicating cointegration in all three models. Hence, the error correction model is a feasible option to curb autocorrelation. One way to estimate the error correction model is to use least squares to estimate the cointegration relationship yt = β1 + β2xt, and then to use the lagged residuals as a right-hand-side variable in the error correction model, estimating it with a second least squares regression. After implementing the error correction model, we run Equation 5 to check whether there is still correlation between residuals from different periods, and Table 8 presents the p-values of the estimates of ρ, the slope coefficient between the residuals in period t and those in periods t-1, t-2, ...

TABLE 8
P-value of ρ Estimates from Equation 5

Pooled regression    RE along cross-sections    RE along periods

Note: we do not present estimation results further than t-12 since number of observations has already lowered to 19 at t-2.

As Table 8 shows, only two estimates of ρ are significant at the 5% level, those between the residuals in period t and period t-3 and between period t and period t-8 for the RE model along periods. Hence, the error correction model largely eliminates autocorrelation.
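The two-step procedure can be written out as follows. This is the generic textbook form for a single explanatory variable x and is given here only as an illustration; the symbols α0, α1 and λ are assumptions and do not appear in the original equations.

```latex
\text{Step 1:}\quad y_t = \beta_1 + \beta_2 x_t + e_t
\quad\Rightarrow\quad \hat{e}_t = y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t

\text{Step 2:}\quad \Delta y_t = \alpha_0 + \alpha_1 \Delta x_t + \lambda\, \hat{e}_{t-1} + v_t
```

The first regression estimates the long-run relationship, and the lagged residual in the second regression measures how far the dependent variable was from that relationship in the previous period. A negative λ means that part of the deviation is corrected in each period.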

3.3.4 Collinearity

Another problem that might seriously undermine least squares is collinearity between the variables on the right-hand side of an equation. When nearly exact linear dependencies exist among the explanatory variables, some of the variances, standard errors and covariances of the least squares estimators may be large. In this case, it is likely that the usual t-tests will lead to the conclusion that parameter estimates are not significantly different from zero, despite possibly high R-square or F-values indicating a ‘significant’ model as a whole (Hill, Griffiths, and Judge 2001). To check whether our regression suffers from collinearity, we run the three regressions. The results in Table 9 show that collinearity does not pose a problem for our regression, as all explanatory variables are significant and the models as a whole have high coefficients of determination.

TABLE 9
Regression Results

                          Pooled regression    RE along cross-sections    RE along periods
ln Yit                    2.466** (.044)       2.336** (0.060)            2.465** (0.057)
ln Yht                    1.602** (.266)       1.806** (0.580)            1.750** (.691)
ln Pit                    -1.813** (.047)      -1.699** (0.045)           -1.813** (.043)
ln Pht                    -21.680** (.858)     -21.672** (2.665)          -22.385** (3.081)
ln Dih                    -0.812** (0.066)     -0.857** (0.065)           -0.854** (.059)
GD                        1.376** (.160)       1.319** (0.130)            1.280** (.128)
CEPA                      0.775** (0.064)      0.655** (0.069)            0.740** (.074)
Adjusted R-square         0.988                0.952                      0.966
Number of observations    249                  249                        249
DW statistics             2.288                2.231                      2.170

Note: numbers in brackets are standard deviations. Pooled regression is estimated with FGLS along cross-sections, while both RE regressions are estimated by OLS with the White cross-sectional covariance estimator.

In summary, to answer the research questions with the gravity model, we run three regression models: pooled regression with FGLS along cross-sections, RE along cross-sections with the White cross-sectional covariance estimator and RE along periods with the White cross-sectional covariance estimator. To correct for autocorrelation, we apply the error correction model as described above.

4. RESULTS AND DISCUSSION

Table 9 presents the regression results for the three econometric specifications: pooled regression with FGLS along cross-sections, and RE along periods and RE along cross-sections with the White consistent covariance estimator. The coefficient estimates from the three specifications share the same signs and differ little in magnitude, which gives a robust and consistent picture of the power of the gravity model in explaining trade flow between inland provinces and Hong Kong. Most importantly, all coefficients are significant at 1% and have the expected signs. According to the pooled regression results, a one percent increase in the nominal GDP of an inland province is associated with a 2.466% increase in its trade flow with Hong Kong, while a one percent increase in the nominal GDP of Hong Kong is associated with a 1.602% increase in trade flow. Hence, hypotheses 1 and 2 are supported. The populations of both inland provinces and Hong Kong are negatively correlated with trade flow between them. In particular, a one percent increase in the population of an inland province or of Hong Kong is associated with a 1.813% or 21.68% decrease respectively in trade flow between inland provinces and Hong Kong, which is consistent with hypotheses 3 and 4. With economic size controlled for, a bigger population lowers the average disposable income of consumers. Some scholars find similar results (Lin & Xia, 2004), while others find the contrary (Egger & Pfaffermayr, 2003). The reason why the population of Hong Kong has a much bigger negative effect than that of inland provinces might be that population density in Hong Kong is much higher than in any inland province, which generates a much bigger marginal effect of population in lowering average disposable income than in inland provinces, where population density is far lower. Hypothesis 5 is also supported: the coefficient for distance shows that a province whose capital city is one percent farther away from Hong Kong generally trades 0.812% less with Hong Kong.

There is also strong support for hypotheses 6 and 7. Because the dependent variable is in logarithms, the dummy coefficients translate into multiplicative factors rather than dollar amounts: Guangdong's trade with Hong Kong is approximately e^1.376 ≈ 3.96 times that of an otherwise comparable province outside CEPA, and a CEPA province (except Guangdong) trades on average e^0.775 ≈ 2.17 times as much with Hong Kong as an otherwise comparable non-CEPA province. Given Guangdong’s geographic, linguistic and cultural proximity to Hong Kong, it is not surprising to find that it trades with Hong Kong more than any other province. However, CEPA did not come into force until 2004 and our dataset covers the period from 1998 to 2003, when there was not yet any policy support to facilitate trade flow between Hong Kong and the other CEPA members. Still, CEPA provinces trade more with Hong Kong with distance controlled for. Hence, it is quite possible that CEPA is a natural trade area, as Frankel et al. (1994) find APEC to be.
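The conversion of a dummy coefficient into a trade effect can be checked with a few lines of code. The sketch assumes that the dependent variable is the natural log of trade flow, in which case a dummy coefficient β implies a multiplicative factor of e^β; the function name is hypothetical and the coefficients are the pooled regression estimates from Table 9.

```python
import math

def dummy_effect(beta):
    """Factor and percentage difference implied by a dummy coefficient
    when the dependent variable is in natural logs."""
    factor = math.exp(beta)
    return factor, (factor - 1.0) * 100.0

gd_factor, gd_pct = dummy_effect(1.376)      # GD coefficient, pooled regression
cepa_factor, cepa_pct = dummy_effect(0.775)  # CEPA coefficient, pooled regression
```

The factors come out at roughly 3.96 for Guangdong and 2.17 for the other CEPA provinces, i.e. trade that is about 296% and 117% higher than for an otherwise comparable province with both dummies equal to zero.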


5. CONCLUSION

To our knowledge, this paper is the first attempt to investigate the trade partnership between Hong Kong and mainland China at the provincial level. Because inland provinces are at different levels of economic development, Hong Kong and the inland provinces benefit from trade between them in different ways and to different degrees. With a panel dataset of 24 provinces or province-level regions over the period from 1990 to 2004, we use the gravity model to estimate how trade flow between Hong Kong and inland provinces is related to economic size, population, distance and a set of dummy variables. The regression estimates suggest that economic size, population and distance all affect trade flow between mainland China and Hong Kong. In particular, trade mainly takes place between Hong Kong and provinces with a relatively developed economy that are close to Hong Kong. In addition, the significant coefficients of the two dummy variables bear witness to close economic ties between the nine CEPA provinces (including Guangdong) and Hong Kong, given their special cultural and geographic relations. Faster economic cooperation between CEPA members and Hong Kong is expected in the future as the policy arrangement comes into force.

APPENDIX A

The 24 provinces or province-level regions in our sample are Beijing, Tianjin, Hebei, Shanxi, Liaoning, Heilongjiang, Shanghai, Jiangsu, Zhejiang, Anhui, Fujian, Jiangxi, Shandong, Hunan, Guangdong, Guangxi, Hainan, Chongqing, Sichuan, Yunnan, Shaanxi, Gansu, Ningxia and Xinjiang.

The seven provinces or province-level regions discarded due to lack of data are Jilin, Hubei, Guizhou, Tibet, Inner Mongolia, Henan and Qinghai.

APPENDIX B

The likelihood ratio (LR) test for the pooled regression model against the FE model. The test statistic, under the null hypothesis $u_1 = u_2 = \dots = u_N = u$, is

$$LR = NT \cdot \log\left(1 + \frac{RSS_r - RSS_u}{RSS_u}\right) \sim \chi^2(N-1)$$

where $RSS_r$ and $RSS_u$ represent the residual sums of squares in the restricted and unrestricted models respectively. In this case, the restricted model is the pooled regression model and the unrestricted model is the FE model. Rejection of the null hypothesis favours use of the FE model.
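The LR statistic can be computed directly from the two residual sums of squares. The following is a minimal Python sketch, not the routine used for the thesis; the function name and the numbers in the usage line are hypothetical.

```python
import math


def lr_pooled_vs_fe(rss_restricted, rss_unrestricted, n_units, n_periods):
    """LR statistic for pooled OLS (restricted) against fixed effects (unrestricted).

    Returns the statistic and its chi-square degrees of freedom (N - 1).
    """
    if rss_unrestricted <= 0 or rss_restricted < rss_unrestricted:
        raise ValueError("expected 0 < RSS_u <= RSS_r")
    nt = n_units * n_periods
    ratio = (rss_restricted - rss_unrestricted) / rss_unrestricted
    lr = nt * math.log(1.0 + ratio)
    return lr, n_units - 1


# Hypothetical residual sums of squares for a panel with N = 24 and T = 15.
stat, df = lr_pooled_vs_fe(120.0, 60.0, 24, 15)
print(stat, df)
```

The statistic is then compared with a critical value of the chi-square distribution with N - 1 degrees of freedom.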

The Hausman specification (HS) test for the RE model against the FE model. The test statistic, under the null hypothesis that the RE model is the correct specification, is

$$HS = (b_{fe} - b_{re})' \left[\mathrm{Var}(b_{fe} - b_{re})\right]^{-1} (b_{fe} - b_{re}) \sim \chi^2(k)$$

where $b_{fe}$ and $b_{re}$ are the coefficient estimators in the FE and RE models respectively; k represents the number of regressors and Var is the variance-covariance matrix. Again, a large value of the test statistic favours the FE model.
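A minimal Python sketch of the Hausman statistic is given here. It assumes the standard result that, under the null, Var(b_fe - b_re) equals Var(b_fe) - Var(b_re); the function name and the input values are hypothetical.

```python
import numpy as np


def hausman(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and degrees of freedom k.

    Assumption: Var(b_fe - b_re) = Var(b_fe) - Var(b_re) under the null.
    """
    diff = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    var_diff = np.asarray(var_fe, dtype=float) - np.asarray(var_re, dtype=float)
    # Solve var_diff * x = diff instead of forming the inverse explicitly.
    stat = float(diff @ np.linalg.solve(var_diff, diff))
    return stat, diff.size


# Hypothetical estimates for two regressors.
stat, k = hausman([1.0, 0.5], [0.8, 0.4],
                  np.diag([0.05, 0.02]), np.diag([0.01, 0.01]))
print(stat, k)
```

In finite samples the difference of the two covariance matrices need not be positive definite, in which case the statistic can be negative and should be interpreted with caution.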

APPENDIX C

The Lagrange multiplier (LM) test for the pooled regression model against the RE model. The test statistic, under the null hypothesis $\sigma_u^2 = 0$, is

$$LM = \frac{NT}{2(T-1)} \left[\frac{v'(I_N \otimes J_T)\,v}{v'v} - 1\right]^2 \sim \chi^2(1)$$

where v is the vector of residuals, $I_N$ is an identity matrix of dimension N, $J_T$ a matrix of ones of dimension T and ⊗ denotes the Kronecker product. Rejection of the null hypothesis favours the RE model against the pooled regression model.
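The quadratic form in the LM statistic does not require building the Kronecker product: v'(I_N ⊗ J_T)v equals the sum over cross-sections of the squared within-unit residual sums. A minimal Python sketch under the assumption of a balanced panel, with residuals arranged as an N-by-T array (the function name and the toy residuals are hypothetical):

```python
import numpy as np


def lm_random_effects(resid):
    """LM statistic for pooled OLS against random effects.

    resid: array of shape (N, T) holding pooled OLS residuals,
    one row per cross-section (balanced panel assumed).
    """
    v = np.asarray(resid, dtype=float)
    n, t = v.shape
    # v'(I_N kron J_T)v: squared sum of residuals within each unit, added up.
    numerator = np.sum(v.sum(axis=1) ** 2)
    denominator = np.sum(v ** 2)
    return n * t / (2.0 * (t - 1)) * (numerator / denominator - 1.0) ** 2


# Toy residuals for N = 2 units and T = 2 periods.
print(lm_random_effects([[1.0, -1.0], [2.0, 0.0]]))
```

The result is compared with a critical value of the chi-square distribution with one degree of freedom.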

APPENDIX D

The following steps will give a value for the LM statistic for testing for heteroskedasticity in the linear regression model $y_t = x_t'\beta + e_t$.

1. Find OLS estimates b and the corresponding residuals $\hat{e}_t = y_t - x_t'b$. Also note the error variance estimate

$$\hat{\sigma}^2 = \frac{1}{T-K} \sum_{t=1}^{T} \hat{e}_t^2 .$$

2. Regress the squares of the OLS residuals $\hat{e}_t^2$ on variables that you suspect may be causing the heteroskedasticity. Often these variables will be the same as $x_t$, but they can be different. Note or calculate the value of the explained or regression sum of squares (SSR).

3. Compute the value of the LM statistic $LM = SSR / (2\hat{\sigma}^4)$.

4. If the null hypothesis of homoskedasticity is correct, LM has a $\chi^2(J)$ distribution, where J is the number of explanatory variables in the regression in step 2, excluding the constant. We reject $H_0$ if LM exceeds a critical value from the $\chi^2(J)$ distribution.

Here, since we are investigating heteroskedasticity across cross-sections, the T in step 1 is replaced with N, the number of cross-sections in our case. We first run the pooled regression for Equation 3. In step 2 we regress $\hat{e}_{it}^2$ on the four size-measuring variables, $\ln Y_{it}$, $\ln Y_{ht}$, $\ln P_{it}$ and $\ln P_{ht}$, together and separately, as potential causes of heteroskedasticity. Tables 3, 4 and 5 in the main text present the results.
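The four steps of the LM test can be sketched in Python as follows. This is a minimal illustration using ordinary least squares from NumPy, not the routine used to produce the tables in the main text; the function name, argument names and toy data are hypothetical.

```python
import numpy as np


def lm_heteroskedasticity(y, X, Z):
    """LM statistic for heteroskedasticity and its degrees of freedom J.

    y: (n,) dependent variable.
    X: (n, K) regressors of the main model, including the constant.
    Z: (n,) or (n, J) variables suspected of driving the error variance,
       without a constant (one is added below).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    Z = np.asarray(Z, dtype=float).reshape(n, -1)

    # Step 1: OLS residuals and the error variance estimate.
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    sigma2 = e @ e / (n - k)

    # Step 2: auxiliary regression of squared residuals on a constant and Z.
    e2 = e ** 2
    Zc = np.column_stack([np.ones(n), Z])
    g, *_ = np.linalg.lstsq(Zc, e2, rcond=None)
    ssr = np.sum((Zc @ g - e2.mean()) ** 2)  # explained sum of squares

    # Step 3: LM = SSR / (2 * sigma^4); step 4 compares it with chi-square(J).
    return ssr / (2.0 * sigma2 ** 2), Z.shape[1]


# Toy data: the errors are chosen to be orthogonal to the regressors.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones(4), x])
y = 2.0 + 3.0 * x + np.array([1.0, -2.0, 1.0, 0.0])
print(lm_heteroskedasticity(y, X, x))
```

Passing one suspected variable at a time, or all of them together, corresponds to the "separately" and "together" variants described above.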

APPENDIX E

With a large N and fixed T, if $\mathrm{Var}(v_i^* \mid x_i) = \Omega(x_i)$, where $\Omega(x_i)$ is a symmetric matrix of order T containing unknown functions of $x_i$, the optimal estimator of β is a generalised least squares (GLS) estimator that weights the observations by $\Omega(x_i)^{-1}$.

This estimator is infeasible if $\Omega(x_i)$, the weights in mathematical form, is unknown.

In this case GLS cannot be implemented because a variable exhibiting the same variance pattern as Ω is not available. A feasible semiparametric GLS estimator would instead use a nonparametric estimator of $E(v_i^* v_i^{*\prime} \mid x_i)$ based on within-group residuals. Under

BIBLIOGRAPHY

Aitken, B. J. & Harrison, A. E. 1999. Do Domestic Firms Benefit from Direct Foreign Investment? Evidence from Venezuela. American Economic Review, Vol. 89, pp. 605-618

Arellano, M., Imbens, G., Mizon, G. E., Pagan, A. & Watson, M. (Ed) 2003. Panel Data Econometrics. Oxford, England: Oxford University Press

Cheng, H. F. & Ruan, X. 2004. Regional Choice of International Direct Investment from China: A Gravity Model Analysis. Institute of World Economics and Politics, Chinese Academy of Social Sciences. Working paper

Egger, P. & Pfaffermayr, M. 2003. The Proper Panel Econometric Specification of the Gravity Equation: A Three-Way Model with Bilateral Interaction Effects. Empirical Economics, Vol. 28, July 2003, pp. 571-580. DOI: 10.1007/s001810200146

Filippini, C. & Molini, V. 2003. The Determinants of East Asian Trade Flows: A Gravity Equation Approach. Journal of Asian Economics, Vol. 14, Issue 5, October 2003, pp. 695-711

Frankel, J., Stein, E. & Wei, S. 1994. Trading Blocs: The Natural, the Unnatural and the Supernatural. Berkeley, CA: University of California, Centre for International and Development Research (CIDER). Working Paper No. C94-034

Hill, R. C., Griffiths, W. E. & Judge, G. G. 2001. Undergraduate Econometrics, 2nd Ed. John Wiley & Sons, Inc

Huang, D. & Huang, Y. 2006. Development of Trade Flow among Hong Kong, Mainland China and Taiwan: An Empirical Application of Gravity Model. Taiwan Economic Forecast and Policies. Economic Research House of Central Research Institution, 36:2 (2006), 47-75

Lin, J. & Xia, Y. 2004. Empirical Analysis: Trade Flow between Hong Kong and Mainland China under CEPA. Guangdong Social Science, Vol. 6, 2004

Liu, Z. 2002. Foreign Direct Investment and Technology Spillover: Evidence from China. Journal of Comparative Economics 30, 579-602

Martinez-Zarzoso, I. & Nowak-Lehmann, F. 2003. Augmented Gravity Model: An Empirical Application to MERCOSUR-European Union Trade Flows. Journal of Applied Economics, Vol. VI, No. 2 (Nov 2003), 291-316

Model of International Trade. The World Economy, 1996, Vol. 19, Issue 5, p. 533

Poyhonen, P. 1963. A Tentative Model for the Volume of Trade between Countries. Weltwirtschaftliches Archiv, Vol. 90, pp. 93-100

Roberts, B. A. 2004. A Gravity Study of the Proposed China-ASEAN Trade Area. International Trade Journal, Winter 2004, Vol. 18, Issue 4, pp. 335-353. DOI: 10.1080/08853900490518208

Tinbergen, J. 1962. Shaping the World Economy: Suggestions for an International Economic Policy. New York: Twentieth Century Fund