
University of Groningen

Geographic differences in the prevalence of hypertension in Uganda

Lunyera, Joseph; Kirenga, Bruce; Stanifer, John W.; Kasozi, Samuel; van der Molen, Thys;

Katagira, Wenceslaus; Kamya, Moses R.; Kalyesubula, Robert

Published in: PLoS ONE

DOI:

10.1371/journal.pone.0201001

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version

Publisher's PDF, also known as Version of record

Publication date: 2018

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):

Lunyera, J., Kirenga, B., Stanifer, J. W., Kasozi, S., van der Molen, T., Katagira, W., Kamya, M. R., & Kalyesubula, R. (2018). Geographic differences in the prevalence of hypertension in Uganda: Results of a national epidemiological study. PLoS ONE, 13(8), [0201001]. https://doi.org/10.1371/journal.pone.0201001

Copyright

Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Take-down policy

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.


https://doi.org/10.1007/s00181-018-1452-5

The spatial empirical Bayes predictor of the small area mean for a lognormal variable of interest and spatially correlated random effects

Dian Handayani (1, 2, 3) · Henk Folmer (2, 4) · Anang Kurnia (3) · Khairil Anwar Notodiputro (3)

Received: 8 February 2017 / Accepted: 11 January 2018 / Published online: 26 May 2018 © Springer-Verlag GmbH Germany, part of Springer Nature 2018

Khairil Anwar Notodiputro: khairil@apps.ipb.ac.id
Dian Handayani: dianh@unj.ac.id; dian99163@yahoo.com
Henk Folmer: h.folmer@rug.nl
Anang Kurnia: anangk@apps.ipb.ac.id

1 Department of Statistics, State University of Jakarta, Jakarta, Indonesia
2 Faculty of Spatial Sciences, University of Groningen, Groningen, The Netherlands
3 Department of Statistics, Bogor Agricultural University, Bogor, Indonesia
4 College of Economics and Management, Northwest Agriculture and Forestry University, Yangling,

Abstract The standard small area estimator, the empirical best linear unbiased predictor (EBLUP), estimates small area parameters by way of linear mixed models. The EBLUP assumes normal and independent random small area effects as well as normal and independent random sampling errors. Under these assumptions, the variable of interest also follows a normal distribution. In practice, however, the above assumptions are often violated. The variable of interest is often non-normal and highly skewed, and the small areas are frequently spatially dependent. In this paper, we propose the spatial empirical Bayes predictor (SEBP) of the small area mean of a positively skewed variable of interest in the presence of spatial dependence among the random small area effects. We assume that the variable of interest follows a normal distribution after a log transformation and that its log transform is linked to some auxiliary variables by a nested error regression model. The SEBP is derived under the log-transformed nested error regression model. By way of simulation, we show that, compared to its alternatives, the SEBP has the smallest average relative bias and average relative root-mean-squared error for various combinations, though not all, of skewness and spatial correlation. The alternatives are the direct estimator, which is based solely on the survey data for the small area under study; the EBLUP, which does not take into account spatial dependence and skewness; and the empirical Bayes predictor, which takes into account skewness but not spatial dependence among the small areas.

Keywords Small area estimation · Spatial empirical Bayes predictor · Spatial dependence · Skewed distribution

1 Introduction

Surveys usually only allow parameter estimation with an acceptable level of precision for large areas (for instance, the national, state, or provincial level). In recent years, the need for parameter estimates at lower administrative levels (subpopulations or small areas) has increased. For example, to allocate funds to local governments, information about local parameters, such as poverty rates, unemployment rates, or mortality rates, is needed. However, data sets from surveys are often not large enough to allow direct estimation of subpopulation parameters with acceptable precision (“direct” refers here to an estimator based on the survey data for the subpopulation only). Such subpopulations are called “small areas” (Rao and Molina 2015), and the field of statistics dealing with the estimation of parameters for such areas is called “small area estimation” (SAE).

The standard SAE method, the EBLUP (empirical best linear unbiased predictor), is based on a linear mixed model (LMM) which relates the variable of interest to some fixed effects and random effects (McCulloch and Searle 2001). In the EBLUP, the auxiliary information is treated as fixed effects and the small areas as random effects. The variable of interest is assumed to follow a normal distribution. The random area effects and the sampling errors are also assumed to follow normal distributions. Furthermore, no spatial dependence is assumed among the random area effects or among the sampling errors. In practice, however, these assumptions are frequently violated. In socioeconomic surveys, for example, the variable of interest is often skewed. Berg and Chandra (2014), Chandra and Chambers (2011), Bellow and Lahiri (2011), Wang and Fuller (2003), Slud and Maiti (2006), and Kurnia and Chambers (2011) developed SAEs for non-normal, skewed, continuous variables of interest. Berg and Chandra (2014), Slud and Maiti (2006), and Kurnia and Chambers (2011) proposed the empirical Bayes predictor (EBP) for a lognormally distributed variable of interest. Although the above references correct for non-normality and skewness, they assume independence among small areas.

In practice, the assumption of independence among the small areas is also frequently violated; there is often spatial dependence among them. Petrucci and Salvati (2004a, b), Salvati (2004), Petrucci and Salvati (2006), Pratesi and Salvati (2008), and Molina et al. (2009) introduced the spatial empirical best linear unbiased predictor (SEBLUP), which relaxes the assumption of independence among the small areas. The above references showed that the estimated SEBLUP standard errors tend to be smaller than the estimated EBLUP standard errors if there is strong spatial dependence among the small areas. Although the above references take spatial dependence among the small areas into account, they still assume a normal distribution for the variable of interest. In practice, a skewed variable of interest and spatial dependence among the small areas often occur together. Therefore, we propose the spatial empirical Bayes predictor (SEBP), which accounts for both skewness and spatial dependence among the small areas. Our study assumes that the variable of interest is normally distributed after log transformation. Furthermore, the relationship between the log-transformed variable of interest and the auxiliary variables is assumed to be linear and is modeled by a nested error regression model.

The paper is organized as follows. Section 2 summarizes the EBLUP and EBP, and Sect. 3 presents the SEBP. Section 4 presents a Monte Carlo simulation of the performance of the SEBP compared to the EBP, EBLUP, and the direct estimator. Section 5 concludes.

2 A review of small area estimators based on a linear mixed model

2.1 The empirical best linear unbiased predictor (EBLUP)

Before going into detail, we note that population characteristics will be denoted by capitals and sample characteristics by lower case letters. Consider a population $U$ divided into $M$ non-overlapping small areas denoted $U_i$, $i = 1, 2, \ldots, M$. $U_i$ consists of $N_i$ elements so that population $U$ has $N$ elements ($N = N_1 + N_2 + \cdots + N_M$). For element $j$ in the $i$th small area, let $p$ auxiliary data $x_{ij} = (x_{1ij}, x_{2ij}, \ldots, x_{pij})^T$ be available and linked to the variable of interest $y_{ij}$ through the nested error regression model:

$$y_{ij} = x_{ij}^T\beta + z_{ij}v_i + e_{ij}, \quad j = 1, 2, \ldots, N_i;\; i = 1, 2, \ldots, M \tag{1}$$

where $\beta$ is the vector of regression parameters, $z_{ij}$ is a known positive constant, and $v_i$ and $e_{ij}$ are random small area effects and random sampling errors, respectively, with distributions $v_i \sim N(0, \sigma_v^2)$ and $e_{ij} \sim N(0, \sigma_e^2)$. Furthermore, $v_i$ and $e_{ij}$ are assumed mutually independent. Hence, the variable of interest follows the normal distribution

$$y_{ij} \sim N\left(x_{ij}^T\beta,\; z_{ij}^2\sigma_v^2 + \sigma_e^2\right).$$

Model (1) can be written in matrix notation as:

$$y_i = X_i\beta + Z_i v_i + e_i, \quad i = 1, 2, \ldots, M \tag{2}$$

where $y_i = (y_{i1}, y_{i2}, \ldots, y_{iN_i})^T$ is the $(N_i \times 1)$ vector of the variable of interest, $X_i = (x_{i1}, x_{i2}, \ldots, x_{iN_i})^T$ the $(N_i \times p)$ matrix of unit-specific auxiliary variables, $\beta$ the $(p \times 1)$ vector of regression coefficients, $Z_i$ the $(N_i \times 1)$ vector with elements equal to a positive constant, $v_i$ the random small area effect associated with small area $i$, and $e_i$ the $(N_i \times 1)$ vector of random sampling errors. We assume that $v_i \sim N(0, \sigma_v^2 1_{N_i}1_{N_i}^T)$ and $e_i \sim N(0, \sigma_e^2 I_{N_i})$, where $1_{N_i}$ is the $(N_i \times 1)$ vector with all elements equal to one and $I_{N_i}$ is the $(N_i \times N_i)$ identity matrix.

For the population $U$, model (2) can be written in matrix notation as:

$$Y = X\beta + Zv + e \tag{3}$$

where $Y$ is the $(N \times 1)$ vector of the variable of interest, $X$ the $(N \times p)$ matrix of unit-specific auxiliary variables, $\beta$ the $(p \times 1)$ vector of regression coefficients, $Z$ an $(N \times M)$ matrix (defined below), $v$ the $(M \times 1)$ vector of random small area effects, and $e$ the $(N \times 1)$ vector of random sampling errors. $Z$ is usually specified as (Petrucci and Salvati 2004b):

$$Z = \begin{bmatrix} 1_{N_1} & 0 & \cdots & 0 \\ 0 & 1_{N_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1_{N_M} \end{bmatrix}.$$

The mean of $Y$ in (3) is $X\beta$, and its covariance matrix is $V = \sigma_v^2 ZZ^T + \sigma_e^2 I_N$. The covariance matrix $V$ has dimensions $(N \times N)$.

In this paper, we assume that each small area in population $U$ is sampled, although in some applications some of the small areas may not be sampled. We also assume that there is no bias in sampling the small areas so that population model (1) holds for the sampled small areas:

$$y_{ij} = x_{ij}^T\beta + z_{ij}v_i + e_{ij}; \quad i = 1, 2, \ldots, M;\; j = 1, 2, \ldots, n_i. \tag{4}$$

Suppose the parameter of interest is the $i$th small area mean, defined as:

$$\mu_i = \frac{1}{N_i}\sum_{j=1}^{N_i} y_{ij} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i} y_{ij}\right]; \quad i = 1, 2, \ldots, M \tag{5}$$

where $s_i$ denotes the units sampled in small area $i$ and $r_i$ the non-sampled units. There are $n_i$ units in $s_i$ and $(N_i - n_i)$ units in $r_i$. Sample size $n_i$ is assumed to be nonzero but too small to produce a direct estimator of $\mu_i$ with adequate precision. Note that $\sum_{j\in r_i} y_{ij}$ in (5) is the total of the non-sampled $y_{ij}$. Under model (4), the non-sampled values of $y_{ij}$ will be estimated by $\hat y_{ij} = E(y_{ij} \mid v_i) = x_{ij}^T\hat\beta + z_{ij}v_i$.

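Under model (4), the best linear unbiased predictor (BLUP) of $\mu_i$ is:

$$\hat\mu_i^{\mathrm{BLUP}} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i} \hat y_{ij}^{\mathrm{BLUP}}\right]; \quad i = 1, 2, \ldots, M \tag{6}$$

where

$$\hat y_{ij}^{\mathrm{BLUP}} = E(y_{ij} \mid v_i) = x_{ij}^T\hat\beta + z_{ij}v_i = x_{ij}^T\hat\beta + z_{ij}\gamma_i\left(\bar y_{is} - \bar x_{is}^T\hat\beta\right)$$

with $\gamma_i = z_{ij}^2\sigma_v^2 / \left(z_{ij}^2\sigma_v^2 + \sigma_e^2/n_i\right)$ (the ratio of the model variance to the total variance);

$$\hat\beta = \left[\sum_{i=1}^{M} \frac{x_i x_i^T}{\sigma_e^2/n_i + \sigma_v^2 z_{ij}^2}\right]^{-1}\left[\sum_{i=1}^{M} \frac{x_i y_i^T}{\sigma_e^2/n_i + \sigma_v^2 z_{ij}^2}\right] = \left[\sum_{i=1}^{M} X_i^T V_i^{-1} X_i\right]^{-1}\left[\sum_{i=1}^{M} X_i^T V_i^{-1} y_i\right] = \left(X^T V^{-1} X\right)^{-1} X^T V^{-1} Y$$

is the weighted least squares estimator of $\beta$; and $\bar y_{is}$ and $\bar x_{is}$ are the sample averages of $y_{ij}$ and $x_{ij}$, respectively.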
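As a rough illustration of (6), the following Python sketch computes the BLUP of a single small area mean for the special case $z_{ij} = 1$ with the variance components treated as known. The function names, the toy data layout, and the use of NumPy are assumptions made for this example and are not part of the paper; plugging in REML or ML estimates of the variance components would turn the same computation into an EBLUP.

```python
import numpy as np

def gls_beta(X_list, y_list, sigma_v2, sigma_e2):
    """Weighted least squares estimate of beta, assuming z_ij = 1 so that
    V_i = sigma_v2 * 1 1^T + sigma_e2 * I for every sampled area."""
    p = X_list[0].shape[1]
    lhs = np.zeros((p, p))
    rhs = np.zeros(p)
    for X, y in zip(X_list, y_list):
        n = len(y)
        V = sigma_v2 * np.ones((n, n)) + sigma_e2 * np.eye(n)
        Vinv_X = np.linalg.solve(V, X)   # V^{-1} X without forming V^{-1}
        lhs += X.T @ Vinv_X              # sum_i X_i^T V_i^{-1} X_i
        rhs += Vinv_X.T @ y              # sum_i X_i^T V_i^{-1} y_i (V symmetric)
    return np.linalg.solve(lhs, rhs)

def blup_area_mean(X_s, y_s, X_r, beta, sigma_v2, sigma_e2):
    """BLUP of one small area mean: sampled values plus predictions for
    the non-sampled units, divided by the area population size."""
    n_i = len(y_s)
    gamma = sigma_v2 / (sigma_v2 + sigma_e2 / n_i)      # shrinkage factor
    v_hat = gamma * (y_s.mean() - X_s.mean(axis=0) @ beta)
    y_r_hat = X_r @ beta + v_hat                        # non-sampled predictions
    N_i = n_i + X_r.shape[0]
    return (y_s.sum() + y_r_hat.sum()) / N_i, gamma
```

The shrinkage factor moves toward one as $n_i$ grows, so areas with more sampled units rely more on their own residual mean and less on the regression prediction alone.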

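In practice, the variance components $\sigma_v^2$ and $\sigma_e^2$ are usually unknown. They can be estimated from the sample data using restricted maximum likelihood (REML) or maximum likelihood (ML). See Rao and Molina (2015) for details. By replacing $(\sigma_v^2, \sigma_e^2)$ in (6) by their estimates $(\hat\sigma_v^2, \hat\sigma_e^2)$, the empirical best linear unbiased predictor (EBLUP) of $\mu_i$ is obtained as:

$$\hat\mu_i^{\mathrm{EBLUP}} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i} \hat y_{ij}^{\mathrm{EBLUP}}\right]; \quad i = 1, 2, \ldots, M \tag{7}$$

where

$$\hat y_{ij}^{\mathrm{EBLUP}} = x_{ij}^T\hat\beta + z_{ij}\hat v_i = x_{ij}^T\hat\beta + z_{ij}\hat\gamma_i\left(\bar y_{is} - \bar x_{is}^T\hat\beta\right); \quad \hat\gamma_i = \frac{z_{ij}^2\hat\sigma_v^2}{z_{ij}^2\hat\sigma_v^2 + \hat\sigma_e^2/n_i};$$

$$\hat\beta = \left[\sum_{i=1}^{M} \frac{x_i x_i^T}{\hat\sigma_e^2/n_i + \hat\sigma_v^2 z_{ij}^2}\right]^{-1}\left[\sum_{i=1}^{M} \frac{x_i y_i^T}{\hat\sigma_e^2/n_i + \hat\sigma_v^2 z_{ij}^2}\right].$$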

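An approximate estimator of $\mathrm{MSE}(\hat\mu_i^{\mathrm{EBLUP}})$, subject to approximations of order $O(m^{-1})$, is given by (Prasad and Rao 1990):

$$\mathrm{mse}\left(\hat\mu_i^{\mathrm{EBLUP}}\right) = g_{1i}\left(\hat\sigma_v^2, \hat\sigma_e^2\right) + g_{2i}\left(\hat\sigma_v^2, \hat\sigma_e^2\right) + 2g_{3i}\left(\hat\sigma_v^2, \hat\sigma_e^2\right) \tag{8}$$

where $g_{1i}(\hat\sigma_v^2, \hat\sigma_e^2)$ is due to the prediction of the small area effect $v_i$ and is $O(1)$, $g_{2i}(\hat\sigma_v^2, \hat\sigma_e^2)$ is due to estimating $\beta$ and is $O(m^{-1})$ for large $m$, and $g_{3i}(\hat\sigma_v^2, \hat\sigma_e^2)$ is due to estimating $\sigma_v^2$ and is $O(m^{-1})$, where $m$ is the number of small areas that are selected in the sample. For details about the derivation of $\mathrm{MSE}(\hat\mu_i^{\mathrm{EBLUP}})$, we refer to Prasad and Rao (1990) and Rao and Molina (2015).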

2.2 The empirical Bayes predictor (EBP) of the lognormal response

In many socioeconomic analyses, especially for income or expenditure problems, the variable of interest is positively skewed. In this case, the EBLUP is an inaccurate estimator of small area parameters because the normality assumption is violated. Berg and Chandra (2014), Kurnia and Chambers (2011), and Slud and Maiti (2006) developed the empirical Bayes predictor (EBP), which assumes that the variable of interest follows a lognormal distribution, $y_{ij} \sim LN(\mu_i, \sigma^2)$, so that the log transform of $y_{ij}$ follows a normal distribution. The log transform of $y_{ij}$ is linked to some auxiliary variables under the nested error regression model as follows:

$$\log y_{ij} = x_{ij}^T\beta + z_{ij}v_i + e_{ij}; \quad i = 1, 2, \ldots, M;\; j = 1, 2, \ldots, N_i \tag{9}$$

where $v_i$ and $e_{ij}$ are mutually independent error terms with $v_i \sim N(0, \sigma_v^2)$ and $e_{ij} \sim N(0, \sigma_e^2)$. Then, $\log y_{ij}$ follows a normal distribution with mean $x_{ij}^T\beta$ and variance $z_{ij}^2\sigma_v^2 + \sigma_e^2$; that is, $\log y_{ij} \sim N(x_{ij}^T\beta,\; z_{ij}^2\sigma_v^2 + \sigma_e^2)$. Because $y_{ij} \sim LN(\mu_i, \sigma^2)$ and based on the properties of the lognormal distribution, the mean of $y_{ij}$ is:

$$E(y_{ij}) = \mu_i = e^{x_{ij}^T\beta + \frac{1}{2}\left(z_{ij}^2\sigma_v^2 + \sigma_e^2\right)} \tag{10}$$

and its variance is:

$$\mathrm{Var}(y_{ij}) = \sigma^2 = e^{2\left[x_{ij}^T\beta + \frac{1}{2}\left(z_{ij}^2\sigma_v^2 + \sigma_e^2\right)\right]}\left[e^{\left(z_{ij}^2\sigma_v^2 + \sigma_e^2\right)} - 1\right] \tag{11}$$
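The lognormal moments in (10) and (11) can be checked numerically. The sketch below is only an illustration: `m` stands for the linear predictor $x_{ij}^T\beta$ and `s2` for the total log-scale variance $z_{ij}^2\sigma_v^2 + \sigma_e^2$, and both names, like the Monte Carlo helper, are assumptions introduced here rather than notation from the paper.

```python
import numpy as np

def lognormal_mean_var(m, s2):
    """Mean and variance of y when log(y) ~ N(m, s2)."""
    mean = np.exp(m + 0.5 * s2)
    var = np.exp(2.0 * m + s2) * (np.exp(s2) - 1.0)
    return mean, var

def monte_carlo_moments(m, s2, size=200_000, seed=0):
    """Simulated mean and variance of exp(N(m, s2)) for comparison."""
    rng = np.random.default_rng(seed)
    y = np.exp(rng.normal(m, np.sqrt(s2), size=size))
    return y.mean(), y.var()

# Example values: s2 = 0.09 + 0.25 matches sigma_v^2 + sigma_e^2 with z_ij = 1.
theory = lognormal_mean_var(1.0, 0.34)
simulated = monte_carlo_moments(1.0, 0.34)
```

Note that the naive back-transform `np.exp(m)` is smaller than the lognormal mean by the factor `np.exp(0.5 * s2)`, which is why the predictors in this section carry the extra variance term in the exponent.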

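Suppose the parameter of interest is the $i$th small area mean

$$\mu_i = \frac{1}{N_i}\sum_{j=1}^{N_i} y_{ij} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i} y_{ij}\right].$$

To obtain $\sum_{j\in r_i} y_{ij}$, Kurnia and Chambers (2011) estimated the non-sampled values of $y_{ij}$ as $\hat y_{ij} = E(y_{ij})$. Then, the Bayes predictor (BP) for $\mu_i$ is:

$$\hat\mu_i^{\mathrm{BP}} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i} \hat y_{ij}^{\mathrm{BP}}\right] \tag{12}$$

where $\hat y_{ij}^{\mathrm{BP}} = e^{x_{ij}^T\hat\beta + \frac{1}{2}\left(z_{ij}^2\sigma_v^2 + \sigma_e^2\right)}$.

In practice, $\sigma_v^2$ and $\sigma_e^2$ are unknown and they are estimated by $\hat\sigma_v^2$ and $\hat\sigma_e^2$, respectively, using REML or ML. The empirical Bayes predictor (EBP) of $\mu_i$ is obtained by replacing the unknown parameters $\sigma_v^2$ and $\sigma_e^2$ by their estimates:

$$\hat\mu_i^{\mathrm{EBP}} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i} \hat y_{ij}^{\mathrm{EBP}}\right] \tag{13}$$

where $\hat y_{ij}^{\mathrm{EBP}} = e^{x_{ij}^T\hat\beta + \frac{1}{2}\left(z_{ij}^2\hat\sigma_v^2 + \hat\sigma_e^2\right)}$.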

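If $y_{ij}$ is not strictly lognormally distributed, then $\hat\mu_i^{\mathrm{EBP}}$ tends to be biased. Following Karlberg (2000), Kurnia and Chambers (2011) proposed the following bias correction for (13):

$$c_{ij} = 1 + \frac{1}{2}x_{ij}^T\,\mathrm{Var}(\hat\beta)\,x_{ij} + \frac{1}{8}\nabla\left(z_{ij}^2\hat\sigma_v^2 + \hat\sigma_e^2\right) \tag{14}$$

where $\nabla(\cdot)$ is the asymptotic variance–covariance matrix of the total variability of the response.

As a result, the EBP of $\mu_i$ with bias correction (14) is:

$$\hat\mu_i^{\mathrm{EBP}} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i} \hat y_{ij}^{\mathrm{EBP}}\right] \tag{15}$$

where $\hat y_{ij}^{\mathrm{EBP}} = c_{ij}^{-1}\, e^{x_{ij}^T\hat\beta + \frac{1}{2}\left(z_{ij}^2\hat\sigma_v^2 + \hat\sigma_e^2\right)}$, with $c_{ij}$ as in (14) and $\mathrm{Var}(\hat\beta) = \left(X^T V^{-1} X\right)^{-1}$ the variance of $\hat\beta$.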

3 The spatial empirical Bayes predictor (SEBP)

The EBP (15) is derived under model (9) with the assumption that the random small area effects $v_i$ are independent. However, in many applications, the response of a unit in a given area may influence the responses of units in other areas. For example, house prices in neighboring small areas tend to influence each other. In other words, there is spatial dependence among small areas.

Spatial dependence among small areas can be taken into account by spatially correlated random small area effects. Model (9) for unit $j$ in spatially correlated small area $i$ thus becomes:

$$y_{ij}^{\#} = \log y_{ij} = x_{ij}^T\beta + z_{ij}u_i + e_{ij}; \quad i = 1, 2, \ldots, M,\; j = 1, 2, \ldots, N_i \tag{16}$$

where $u_i$ is the random small area effect, which is assumed to follow a spatial error autoregressive (SAR) process with spatial correlation coefficient $\rho$ and spatial weights matrix $W$.

Model (16) for the population can be written in matrix notation as follows:

$$Y^{\#} = X\beta + Zu + e \tag{17}$$

where $Y^{\#} = \left(y_{11}^{\#}, \ldots, y_{1N_1}^{\#}, \ldots, y_{M1}^{\#}, \ldots, y_{MN_M}^{\#}\right)^T = \left(\log y_{11}, \ldots, \log y_{1N_1}, \ldots, \log y_{M1}, \ldots, \log y_{MN_M}\right)^T$, $u = \rho W u + v \Rightarrow u = (I - \rho W)^{-1}v$, $v \sim N(0, \sigma_v^2 I)$, and $e \sim N(0, \sigma_e^2 I)$. Furthermore, the random small area effect $u$ is assumed to follow a normal distribution:

$$u \sim N(0, D); \quad D = \sigma_v^2\left[(I - \rho W^T)(I - \rho W)\right]^{-1}.$$

Note that the dimension of $Y^{\#}$ is $(N \times 1)$ with $N = N_1 + N_2 + \cdots + N_M$, $X$ is $(N \times p)$, $Z$ is $(N \times M)$, $u$ is $(M \times 1)$, $v$ is $(M \times 1)$, and the weights matrix $W$ is $(M \times M)$. Although there are many types of weights matrix $W$, we apply the symmetric binary weights matrix with elements $w_{ik} = 1$ if small area $i$ is adjacent to small area $k$ and zero otherwise ($i, k = 1, 2, \ldots, M$). The matrix $W$ is row-standardized such that the row elements sum to one. Under these assumptions, $y_{ij}^{\#}$ follows a normal distribution:

$$y_{ij}^{\#} = \log y_{ij} \sim N\left(x_{ij}^T\beta,\; z_{ij}^2\tau_i^2 + \sigma_e^2\right); \quad \tau_i^2 = b_i^T D b_i$$

with $b_i^T$ a $(1 \times M)$ vector with 1 in the $i$th position and zeros elsewhere.
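To make the SAR covariance structure concrete, the sketch below builds a row-standardized weights matrix and the resulting matrix $D = \sigma_v^2[(I - \rho W^T)(I - \rho W)]^{-1}$, and reads off $\tau_i^2 = b_i^T D b_i$ as the diagonal of $D$. The chain-shaped adjacency (each area adjacent only to its immediate neighbours on a line) and all function names are hypothetical choices for illustration; the paper does not specify the map used.

```python
import numpy as np

def chain_adjacency(M):
    """Hypothetical map: M areas on a line, each adjacent to its neighbours."""
    adj = np.zeros((M, M))
    idx = np.arange(M - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj

def row_standardize(adj):
    """Scale each row to sum to one; rows without neighbours stay zero."""
    adj = np.asarray(adj, dtype=float)
    row_sums = adj.sum(axis=1, keepdims=True)
    return np.divide(adj, row_sums, out=np.zeros_like(adj), where=row_sums > 0)

def sar_covariance(W, rho, sigma_v2):
    """D = sigma_v2 * [(I - rho W^T)(I - rho W)]^{-1}."""
    A = np.eye(W.shape[0]) - rho * W
    return sigma_v2 * np.linalg.inv(A.T @ A)

def tau2(D):
    """tau_i^2 = b_i^T D b_i is the i-th diagonal element of D."""
    return np.diag(D).copy()

W = row_standardize(chain_adjacency(5))
D = sar_covariance(W, rho=0.5, sigma_v2=0.09)
```

With $\rho = 0$ the matrix $D$ reduces to $\sigma_v^2 I$, which recovers the independent random effects of model (9).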

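We assume that there is no sampling bias for small areas so that population model (17) holds for the sampled small areas:

$$y_{ij}^{\#} = \log y_{ij} = x_{ij}^T\beta + z_{ij}u_i + e_{ij}; \quad i = 1, 2, \ldots, M,\; j = 1, 2, \ldots, n_i. \tag{18}$$

Suppose the parameter of interest is the $i$th small area mean,

$$\mu_i = \frac{1}{N_i}\sum_{j=1}^{N_i} y_{ij} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i} y_{ij}\right].$$

Under model (18) and following Kurnia and Chambers (2011), the spatial Bayes predictor (SBP) for $\mu_i$ is given by:

$$\hat\mu_i^{\mathrm{SBP}} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i} \hat y_{ij}^{\mathrm{SBP}}\right]; \quad i = 1, 2, \ldots, M \tag{19}$$

where $\hat y_{ij}^{\mathrm{SBP}} = E(y_{ij}) = e^{x_{ij}^T\hat\beta + \frac{1}{2}\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}$ and $\tau_i^2 = b_i^T D b_i$. The $\hat y_{ij}^{\mathrm{SBP}}$ is obtained by using the property of the lognormal distribution (note that if $y_{ij} \sim LN(\mu_i, \sigma^2)$, then $\mu_i = E(y_{ij}) = e^{x_{ij}^T\beta + \frac{1}{2}\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}$).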

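The spatial empirical Bayes predictor (SEBP) of $\mu_i$, $\hat\mu_i^{\mathrm{SEBP}}$, is derived by replacing $(\sigma_v^2, \sigma_e^2, \rho)$ in (19) by the estimates $(\hat\sigma_v^2, \hat\sigma_e^2, \hat\rho)$. The estimates $(\hat\sigma_v^2, \hat\sigma_e^2, \hat\rho)$ can be obtained by ML or REML. As in the case of the EBP proposed by Kurnia and Chambers (2011), if the log-transformed variable of interest does not strictly follow a normal distribution, then $\hat\mu_i^{\mathrm{SEBP}}$ tends to be biased for $\mu_i$. In that case, the modified Karlberg bias correction factor (see “Appendix 1”) can be applied to $\hat\mu_i^{\mathrm{SEBP}}$ as follows:

$$\hat\mu_i^{\mathrm{SEBP\_c}} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i} \hat y_{ij}^{\mathrm{SEBP\_c}}\right]; \quad i = 1, 2, \ldots, M \tag{20}$$

where

$$\hat y_{ij}^{\mathrm{SEBP\_c}} = \left(c_{ij}^{\mathrm{SEBP}}\right)^{-1} e^{x_{ij}^T\hat\beta + \frac{1}{2}\left(z_{ij}^2\hat\tau_i^2 + \hat\sigma_e^2\right)}$$

$$c_{ij}^{\mathrm{SEBP}} = 1 + \frac{1}{2}\left[D_1 V(\hat\beta) + D_2 V(\hat\rho) + D_3 V(\hat\sigma_v^2) + D_4 V(\hat\sigma_e^2)\right]$$

$$D_1 = x_{ij}x_{ij}^T; \quad D_2 = \left[A_1A_2A_1A_2 + A_3A_2 + A_1A_4\right]; \quad D_3 = A_5A_5; \quad D_4 = \tfrac{1}{4}; \quad D_5 = \tfrac{1}{2}A_5$$

$$\begin{aligned}
A_1 &= \tfrac{1}{2}z_{ij}^2\, b_i^T\sigma_v^2\left[(I-\rho W^T)(I-\rho W)\right]^{-1} \\
A_2 &= \left(W^T + W - 2\rho W^TW\right)\left[(I-\rho W^T)(I-\rho W)\right]^{-1} b_i \\
A_3 &= \tfrac{1}{2}z_{ij}^2\, b_i^T\sigma_v^2\left[(I-\rho W^T)(I-\rho W)\right]^{-1}\left(W^T + W - 2\rho W^TW\right)\left[(I-\rho W^T)(I-\rho W)\right]^{-1} \\
A_4 &= -2W^TW\left[(I-\rho W^T)(I-\rho W)\right]^{-1} b_i \\
&\quad + \left(W^T + W - 2\rho W^TW\right)\left[(I-\rho W^T)(I-\rho W)\right]^{-1}\left(W^T + W - 2\rho W^TW\right)\left[(I-\rho W^T)(I-\rho W)\right]^{-1} b_i \\
A_5 &= \tfrac{1}{2}z_{ij}^2\, b_i^T\left[(I-\rho W^T)(I-\rho W)\right]^{-1} b_i.
\end{aligned}$$

The MSE of $\hat\mu_i^{\mathrm{SEBP\_c}}$ can be approximated as:

$$\begin{aligned}
\mathrm{MSE}\left(\hat\mu_i^{\mathrm{SEBP\_c}}\right) \approx{}& \mathrm{Var}\left\{N_i^{-1}\sum_{j\in s_i} y_{ij}\right\} \\
&+ N_i^{-2}\Bigg\{\sum_{j=1}^{N_i - n_i} e^{2x_{ij}^T\beta + \left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}\left[e^{\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)} - 1\right] \\
&\quad + 2\sum_{1 \le j < k \le (N_i - n_i)}\left[E\left[\left(x_{ij}^T\beta + z_{ij}u_i\right)\left(x_{ik}^T\beta + z_{ik}u_i\right)\right] - e^{x_{ij}^T\beta + \frac{1}{2}\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}\, e^{x_{ik}^T\beta + \frac{1}{2}\left(z_{ik}^2\tau_i^2 + \sigma_e^2\right)}\right]\Bigg\} \\
&+ N_i^{-2}\left[A^2 V(\hat\rho) + \nabla\left(B^2\hat\sigma_v^2 + C\hat\sigma_e^2\right)\right];
\end{aligned}$$

where $\nabla(\cdot)$ is the asymptotic variance–covariance matrix of the total variability of the response. Its derivation is presented in “Appendix 2.” Since we have not yet derived the estimator of the MSE of $\hat\mu_i^{\mathrm{SEBP\_c}}$, we do not use it any further. The relative performance of $\hat\mu_i^{\mathrm{SEBP\_c}}$ is analyzed by means of simulation in the next section.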
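The predictors (19) and (20) share one computational pattern: back-transform the linear predictor with the lognormal variance term, optionally divide by a correction factor, and average over sampled and non-sampled units. The sketch below shows that pattern under simplifying assumptions: $\tau_i^2$, $\sigma_e^2$, and the correction factor `c` are supplied by the caller rather than estimated, and the function names are invented for this example.

```python
import numpy as np

def predict_nonsampled(X_r, beta, tau2_i, sigma_e2, z=1.0, c=1.0):
    """exp(x^T beta + (z^2 tau_i^2 + sigma_e^2) / 2) / c for each
    non-sampled unit; c = 1 gives the uncorrected predictor."""
    return np.exp(X_r @ beta + 0.5 * (z**2 * tau2_i + sigma_e2)) / c

def area_mean(y_s, y_r_hat):
    """Combine sampled values with predictions for non-sampled units."""
    y_s = np.asarray(y_s, dtype=float)
    return (y_s.sum() + np.sum(y_r_hat)) / (len(y_s) + len(y_r_hat))

# Toy inputs (assumed values, not taken from the paper's simulation).
beta = np.array([2.0, 1.0])
X_r = np.column_stack([np.ones(3), np.array([0.1, 0.2, 0.3])])
y_r_hat = predict_nonsampled(X_r, beta, tau2_i=0.1, sigma_e2=0.25)
mu_hat = area_mean([9.5, 11.0], y_r_hat)
```

Passing a value of `c` greater than one shrinks every prediction for the non-sampled units, which is how the bias correction in (20) enters the area mean.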

4 The performance of the SEBP compared to the EBP, EBLUP, and the direct estimator: evidence from Monte Carlo simulation

This section presents the results of a simulation of the performance of the SEBP compared to the EBP, EBLUP, and the direct estimator. To this end, the values of $\log y_{ij}$ are generated based on model (18) with a single auxiliary variable $X$. The number of small areas is set at 30. To gain insight into the impact of the small area sample size, we consider small, medium, and large sample sizes ranging between 5 and 35. As a rule of thumb, samples larger than 30 are considered “large.” We set the total sample size at 600, which is approximately 3% of the total population. Consequently, the population has 20,000 elements ($N = 20{,}000$).

We generate random effects $v$ from $N(0, 0.09)$ and random sampling errors $e$ from $N(0, 0.25)$. We generate two sets of $Y$ values which follow lognormal distributions with the same mean but different variances. To obtain the two sets of $Y$ values, we fix $\beta = (\beta_0, \beta_1) = (2, 1)$ and generate $X \sim N(8.5, 4)$ and $X \sim N(6, 9)$, respectively. The $X$ values generated from $X \sim N(8.5, 4)$ yield $Y$ values with smaller variance than the $Y$ values based on $X \sim N(6, 9)$. We fix the spatial correlation at $\rho = 0.25, 0.5, 0.75$ to represent small, moderate, and large spatial correlation, respectively.

Based on the above specifications, there are six synthetic spatial populations: Populations 1–3 with $X \sim N(8.5, 4)$ and $\rho = 0.25$, $\rho = 0.5$, and $\rho = 0.75$, respectively, and Populations 4–6 with $X \sim N(6, 9)$ and $\rho = 0.25$, $\rho = 0.5$, and $\rho = 0.75$, respectively. The proximity matrix $W$ is a symmetric binary weights matrix with $w_{ij}$ equal to 1 if small area $i$ is adjacent to small area $j$, and 0 otherwise. Further explanation of the optimum spatial weights matrix in SAE can be found in Asfar et al. (2016). Based on these specifications and under model (18), the values of $\log(y_{ij})$ for the six synthetic populations are determined. The transformation $y_{ij} = \exp(\log(y_{ij}))$ then gives the values of $y_{ij}$, on the basis of which the means $E(y_{ij})$ of all six synthetic populations are calculated. These means are the “true” population means.
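A minimal sketch of how one such synthetic population might be generated is given below. Several details are assumptions of this example rather than specifications from the paper: the second argument of each $N(\cdot, \cdot)$ is read as a variance, the 30 areas are given near-equal population sizes, $z_{ij} = 1$, and the adjacency structure is a simple chain because the actual map is not described.

```python
import numpy as np

def simulate_population(rho, x_mean, x_var, M=30, N=20_000, beta=(2.0, 1.0),
                        sigma_v2=0.09, sigma_e2=0.25, seed=0):
    """Generate area labels, x and y = exp(log y) under a SAR area effect."""
    rng = np.random.default_rng(seed)
    # Hypothetical adjacency: areas on a line, row-standardized.
    adj = np.zeros((M, M))
    idx = np.arange(M - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    W = adj / adj.sum(axis=1, keepdims=True)
    # Spatially correlated area effects: u = (I - rho W)^{-1} v.
    v = rng.normal(0.0, np.sqrt(sigma_v2), size=M)
    u = np.linalg.solve(np.eye(M) - rho * W, v)
    # Assumption: near-equal area sizes.
    area = np.sort(np.arange(N) % M)
    x = rng.normal(x_mean, np.sqrt(x_var), size=N)
    e = rng.normal(0.0, np.sqrt(sigma_e2), size=N)
    log_y = beta[0] + beta[1] * x + u[area] + e
    return area, x, np.exp(log_y)

def true_area_means(area, y, M):
    """Population mean of y within each small area."""
    return np.bincount(area, weights=y, minlength=M) / np.bincount(area, minlength=M)

area, x, y = simulate_population(rho=0.5, x_mean=8.5, x_var=4.0)
mu_true = true_area_means(area, y, 30)
```

Calling the function with `x_mean=6.0, x_var=9.0` and the three values of `rho` would give counterparts of the more heavily skewed populations.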

For each synthetic population, we draw a 3% sample. Then, we combine the sample values of the 30 small areas to estimate the model parameters $(\beta, \sigma_v^2, \sigma_e^2, \rho)$. These parameters are needed to obtain the SEBP (20). By combining the sample values, we “borrow strength,” i.e., enlarge the data set used to estimate the model parameters. Note that “borrowing strength” is an essential feature of SAE. To obtain small area estimators with adequate precision, one often applies indirect estimators that “borrow strength” by using values of the variable of interest from related areas and/or time periods, which increases the “effective” sample size. These values are used in the estimation process through a statistical model.

Based on the combined sample values and under model (18), $(\sigma_v^2, \sigma_e^2, \rho)$ are estimated using REML. The advantage of using REML instead of ML to estimate $(\sigma_v^2, \sigma_e^2, \rho)$ is that it takes into account the loss of degrees of freedom due to estimating $\beta$ (Rao and Molina 2015). Furthermore, Jiang (1996) showed that REML is asymptotically consistent when normality does not hold. Based on the estimates $(\hat\sigma_v^2, \hat\sigma_e^2, \hat\rho)$, we obtain $\hat\beta$ and the estimates of the random area effects $u_i$. Next, we calculate the conditional mean $E(y_{ij})$ under model (18), which is used as an estimate of the non-sampled $y_{ij}$. The predictions of the non-sampled $y_{ij}$ are then combined with the sampled $y_{ij}$. The combined values are used to estimate the small area means (model 18). The procedure is applied to each of the six synthetic populations.

For each synthetic population, $T = 1000$ replicates are generated. For each sample, the population mean is estimated by the SEBP, EBP, EBLUP, and direct estimator. We evaluate these estimators by means of the average relative bias (ARB) and the average relative root-mean-squared error (ARRMSE), which are defined as:

$$\mathrm{ARB} = \frac{1}{M}\sum_{i=1}^{M}\left[\frac{1}{T}\sum_{t=1}^{T}\left(\frac{\hat\mu_{it}}{\mu_i} - 1\right)\right] \times 100\% \tag{21}$$

$$\mathrm{ARRMSE} = \frac{1}{M}\sum_{i=1}^{M}\sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(\frac{\hat\mu_{it}}{\mu_i} - 1\right)^2} \times 100\% \tag{22}$$
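The two evaluation measures translate directly into array operations. In the sketch below, `estimates` is assumed to be a $T \times M$ array holding $\hat\mu_{it}$ (replicate $t$ in rows, small area $i$ in columns) and `true_means` a length-$M$ array of $\mu_i$; these names and the array layout are choices made for this example.

```python
import numpy as np

def arb(estimates, true_means):
    """Average relative bias in percent, as in (21)."""
    rel = np.asarray(estimates) / np.asarray(true_means) - 1.0  # shape (T, M)
    return rel.mean(axis=0).mean() * 100.0

def arrmse(estimates, true_means):
    """Average relative root-mean-squared error in percent, as in (22)."""
    rel = np.asarray(estimates) / np.asarray(true_means) - 1.0
    return np.sqrt((rel ** 2).mean(axis=0)).mean() * 100.0
```

Because the ARB averages signed relative errors, over- and underestimation across replicates and areas can cancel, whereas the ARRMSE accumulates both; this is why an estimator can have a small ARB and a very large ARRMSE at the same time.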

The results of the simulations are reported in Tables 1 and 2 as well as in Figs. 1 and 2. The results in Table 1 and Fig. 1 are for moderately skewed data ($X \sim N(8.5, 4)$) and those in Table 2 and Fig. 2 for heavily skewed data ($X \sim N(6, 9)$).

Table 1 and Fig. 1 show that for moderately skewed data, direct is best in terms of ARB, except for $\rho = 0.5$, but worst in terms of ARRMSE everywhere. Overall, the EBLUP has the largest ARB, while its ARRMSE is smaller than that of the direct but larger than that of the EBP and the SEBP. In terms of ARB, the SEBP outperforms the EBLUP and the EBP everywhere and the direct for $\rho = 0.5$. In terms of ARRMSE, the EBP and the SEBP perform approximately equally well and outperform their alternatives.

Table 2 and Fig. 2 show that for heavily skewed data, direct outperforms its alternatives in terms of ARB, but its ARRMSE is worst for $\rho = 0.25$ and the next worst for $\rho = 0.5$. In terms of ARRMSE, the EBP and the SEBP perform approximately equally well. In particular, the EBP has a slightly smaller ARRMSE for small and medium spatial correlation. For strong spatial correlation, the SEBP clearly outperforms the EBP.

Table 1 ARB and ARRMSE of direct, EBP, EBLUP, and SEBP for moderately skewed data

| ρ | Measure | Direct | EBLUP | EBP | SEBP |
|---|---|---|---|---|---|
| 0.25 | ARB (%) | 0.70 | 15.77 | −5.21 | −1.22 |
| 0.25 | ARRMSE (%) | 144.38 | 42.20 | 12.57 | 13.40 |
| 0.5 | ARB (%) | 0.72 | 11.31 | −4.91 | 0.48 |
| 0.5 | ARRMSE (%) | 133.67 | 20.79 | 8.63 | 9.46 |
| 0.75 | ARB (%) | 0.88 | 12.93 | −7.88 | −4.99 |
| 0.75 | ARRMSE (%) | 123.95 | 23.93 | 8.37 | 8.46 |

Table 2 ARB and ARRMSE of direct, EBP, EBLUP, and SEBP for heavily skewed data

| ρ | Measure | Direct | EBLUP | EBP | SEBP |
|---|---|---|---|---|---|
| 0.25 | ARB (%) | −1.77 | 64.88 | −5.36 | −2.78 |
| 0.25 | ARRMSE (%) | 512.17 | 486.50 | 13.84 | 14.36 |
| 0.5 | ARB (%) | 0.14 | 120.19 | 6.46 | 9.37 |
| 0.5 | ARRMSE (%) | 548.17 | 1689.75 | 25.97 | 27.83 |
| 0.75 | ARB (%) | 0.31 | 133.89 | 9.98 | 6.93 |
| 0.75 | ARRMSE (%) | 442.28 | 3799.33 | 67.74 | 63.70 |

Fig. 1 The ARB (a) and ARRMSE (b) of direct, EBP, EBLUP, and SEBP for moderately skewed data (horizontal axis: ρ = 0.25, 0.50, 0.75)

Fig. 2 Panels (a) and (b) for direct, EBLUP, EBP, and SEBP with heavily skewed data (horizontal axis: ρ = 0.25, 0.50, 0.75)

The following overall conclusions can be drawn from the simulations. First, although direct has the smallest average relative bias, its average relative root-mean-squared error is so large that its applicability is very limited. Secondly, the results for the EBLUP show that failure to control for non-normality leads to substantial bias and root-mean-squared error. This applies to both moderately and heavily skewed data. Thirdly, taking into account spatial dependence among the random area effects tends to improve the bias and, to a lesser extent, the root-mean-squared error, though not uniformly, as shown by the comparison between the EBP and the SEBP.

5 Concluding remarks

Surveys usually only allow parameter estimation with an acceptable level of precision for large areas (for instance, the national, state, or provincial level). In recent years, the need for parameter estimates at lower administrative levels (subpopulations or small areas) has increased. For example, to allocate funds to local governments, information about local parameters, such as the poverty rate, the unemployment rate, or the mortality rate, is needed. Unfortunately, data sets from surveys are often not large enough to yield direct estimators of small area parameters with acceptable precision. Small area estimation (SAE) has been developed as a subfield of statistics to deal with the estimation of parameters for small areas.

Standard SAE methods based on a linear mixed model assume normality and independence among random small area effects as well as normality and independence among the random sampling errors. In practice, however, these assumptions are frequently violated. In socioeconomic surveys, for example, the variable of interest is typically skewed. Moreover, there is often spatial dependence among small areas.

In this paper, we propose the spatial empirical Bayes predictor (SEBP) of the small area mean of a positively skewed variable of interest in the presence of spatial dependence among the random small area effects. The SEBP is derived under a log-transformed nested error regression model. By means of simulation, we analyzed the performance of the SEBP relative to the direct estimator (direct), the empirical best linear unbiased predictor (EBLUP), which does not take into account spatial dependence and skewness, and the empirical Bayes predictor (EBP), which takes into account skewness but not spatial dependence among the small areas. The main finding is that in terms of average relative bias and average relative root-mean-squared error, the SEBP performs well for both moderately and heavily skewed data. Specifically, the SEBP has the smallest average relative bias and average relative root-mean-squared error for various combinations, though not all, of skewness and spatial correlation.

The simulations indicate that the SEBP performs relatively well. It generally has a smaller bias than the EBP and the EBLUP, although not than the direct estimator. Based on the results, the SEBP could potentially have a smaller MSE. We have derived the theoretical mean-squared error of the SEBP, $\mathrm{MSE}(\hat\mu_i^{\mathrm{SEBP}})$, in this paper but not its estimator. To obtain an estimate of the MSE, we will investigate the expectation of the MSE. Its derivation is the next step in our work.


Acknowledgements The authors thank the reviewers for helpful comments and suggestions which led to considerable improvements in the paper.

Appendix 1: The bias correction for $\hat y_{ij}^{\mathrm{SEBP}}$ if the log-transformed variable of interest does not strictly follow a normal distribution

The spatial empirical Bayes predictor (SEBP) ofμiunder model (18) is:

ˆμSEBP i  1 Ni ⎡ ⎣ j∈si yi j +  j∈ri ˆySEBP i j ⎤ ⎦ (A1.1)

where ˆyi jSEBP  Eyi j|

 yi s, xi s   exi jTˆβ+12  z2 i jˆτi2+ˆσe2  ;ˆτi2  bTi ˆDbi  biTˆσv2I− ˆρWT I− ˆρW−1bi; bTi is a(1 × M) vector (0,0,…0,1,0,0…) with 1

for the i th area, ˆβ  % M  i1 xixiT & σ2 e ni +σv2zi j2 '−1%M i1 xiyiT & σ2 e ni +σv2z2i j ' Suppose:ψi j(η)  e xT i jβ+ 1 2  z2 i jτi2+σe2  ;τ2 i  bTi σv2   I − ρWT  (I − ρW)−1bi;η   β, ρ, σ2 v,σe2  ∂βψi j(η)  ψi j(η) xi jT 2 ∂β2ψi j(η)  ψi j(η) xi jx T i j ∂ρψi j(η)  ψi j(η)  1 2z 2 i j ∂ρτ 2 i   ψi j(η)  1 2z 2 i j ∂ρb T i σ 2 v  I− ρWT  (I − ρW)−1bi   ψi j(η) 1 2z 2 i jbiTσv2  I− ρWT  (I − ρW)−1  WT + W− 2ρWTW   I− ρWT  (I − ρW)−1bi  ψi j(η) A1A2; A1 1 2z 2 i jb T i σ 2 v  I− ρWT  (I − ρW)−1 A2  WT + W− 2ρWTW   I− ρWT  (I − ρW)−1bi

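$$\frac{\partial^2}{\partial\rho^2}\psi_{ij}(\eta) = \frac{\partial}{\partial\rho}\left[\frac{\partial}{\partial\rho}\psi_{ij}(\eta)\right] = \frac{\partial}{\partial\rho}\left[\psi_{ij}(\eta)A_1A_2\right] = \frac{\partial}{\partial\rho}\left[BA_2\right];\quad B = \psi_{ij}(\eta)A_1$$

$$\frac{\partial^2}{\partial\rho^2}\psi_{ij}(\eta) = \left(\frac{\partial B}{\partial\rho}\right)A_2 + B\left(\frac{\partial A_2}{\partial\rho}\right)$$

$$\frac{\partial B}{\partial\rho} = \left(\frac{\partial}{\partial\rho}\psi_{ij}(\eta)\right)A_1 + \psi_{ij}(\eta)\left(\frac{\partial A_1}{\partial\rho}\right) = \psi_{ij}(\eta)A_1A_2A_1 + \psi_{ij}(\eta)A_3 = \psi_{ij}(\eta)\left[A_1A_2A_1 + A_3\right]$$

$$\frac{\partial A_1}{\partial\rho} = \tfrac12 z_{ij}^2\, b_i^T\sigma_v^2\left[(I-\rho W^T)(I-\rho W)\right]^{-1}\left(W^T + W - 2\rho W^TW\right)\left[(I-\rho W^T)(I-\rho W)\right]^{-1} = A_3$$

$$\begin{aligned}
\frac{\partial A_2}{\partial\rho} &= \frac{\partial}{\partial\rho}\left(W^T + W - 2\rho W^TW\right)\left[(I-\rho W^T)(I-\rho W)\right]^{-1}b_i + \left(W^T + W - 2\rho W^TW\right)\frac{\partial}{\partial\rho}\left[(I-\rho W^T)(I-\rho W)\right]^{-1}b_i \\
&= -2W^TW\left[(I-\rho W^T)(I-\rho W)\right]^{-1}b_i \\
&\quad + \left(W^T + W - 2\rho W^TW\right)\left[(I-\rho W^T)(I-\rho W)\right]^{-1}\left(W^T + W - 2\rho W^TW\right)\left[(I-\rho W^T)(I-\rho W)\right]^{-1}b_i \\
&= A_4
\end{aligned}$$

$$\frac{\partial^2}{\partial\rho^2}\psi_{ij}(\eta) = \psi_{ij}(\eta)\left[A_1A_2A_1 + A_3\right]A_2 + \psi_{ij}(\eta)A_1A_4 = \psi_{ij}(\eta)\left[A_1A_2A_1A_2 + A_3A_2 + A_1A_4\right]$$

$$\frac{\partial}{\partial\sigma_v^2}\psi_{ij}(\eta) = \psi_{ij}(\eta)\frac{\partial}{\partial\sigma_v^2}\left(\tfrac12 z_{ij}^2\tau_i^2\right) = \psi_{ij}(\eta)\,\tfrac12 z_{ij}^2\, b_i^T\left[(I-\rho W^T)(I-\rho W)\right]^{-1}b_i = \psi_{ij}(\eta)A_5;$$

$$A_5 = \tfrac12 z_{ij}^2\, b_i^T\left[(I-\rho W^T)(I-\rho W)\right]^{-1}b_i$$

$$\frac{\partial}{\partial\sigma_e^2}\psi_{ij}(\eta) = \psi_{ij}(\eta)\frac{\partial}{\partial\sigma_e^2}\left(\tfrac12\sigma_e^2\right) = \tfrac12\psi_{ij}(\eta)$$

$$\frac{\partial^2}{\partial(\sigma_e^2)^2}\psi_{ij}(\eta) = \frac{\partial}{\partial\sigma_e^2}\left(\tfrac12\psi_{ij}(\eta)\right) = \tfrac12\left(\frac{\partial}{\partial\sigma_e^2}\psi_{ij}(\eta)\right) = \tfrac14\psi_{ij}(\eta)$$

$$\frac{\partial^2}{\partial\sigma_v^2\,\partial\sigma_e^2}\psi_{ij}(\eta) = \frac{\partial}{\partial\sigma_v^2}\left(\tfrac12\psi_{ij}(\eta)\right) = \tfrac12\left(\frac{\partial}{\partial\sigma_v^2}\psi_{ij}(\eta)\right) = \tfrac12\psi_{ij}(\eta)A_5$$

The second-order Taylor approximation of $\psi_{ij}(\hat\eta)$ is given by:

$$\psi_{ij}(\hat\eta) \approx \psi_{ij}(\eta) + (\hat\eta-\eta)^T\psi_{ij}^{(1)}(\eta) + \tfrac12(\hat\eta-\eta)^T\psi_{ij}^{(2)}(\eta)(\hat\eta-\eta). \tag{A1.2}$$

Taking expectations on both sides of (A1.2), we obtain:

$$\begin{aligned}
E\left[\psi_{ij}(\hat\eta)\right] &\approx E\left[\psi_{ij}(\eta)\right] + E\left[(\hat\eta-\eta)^T\psi_{ij}^{(1)}(\eta)\right] + E\left[\tfrac12(\hat\eta-\eta)^T\psi_{ij}^{(2)}(\eta)(\hat\eta-\eta)\right] \\
&= \psi_{ij}(\eta) + E\left[\tfrac12(\hat\eta-\eta)^T\psi_{ij}^{(2)}(\eta)(\hat\eta-\eta)\right]
\end{aligned}$$

(Note: $E\left[(\hat\eta-\eta)^T\psi_{ij}^{(1)}(\eta)\right] = 0$ because $\hat\eta$ is assumed to be an unbiased estimator of $\eta$.)

$$\begin{aligned}
E\left[\psi_{ij}(\hat\eta)\right] &\approx \psi_{ij}(\eta) + \tfrac12 E\left[(\hat\eta-\eta)^T\psi_{ij}^{(2)}(\eta)(\hat\eta-\eta)\right] \\
&= \psi_{ij}(\eta) + \tfrac12\operatorname{tr}\left\{\psi_{ij}^{(2)}(\eta)\,E\left[(\hat\eta-\eta)(\hat\eta-\eta)^T\right]\right\} \\
&= \psi_{ij}(\eta) + \tfrac12\operatorname{tr}\left\{
\begin{bmatrix} C_1 & 0 & 0 & 0 \\ 0 & C_2 & 0 & 0 \\ 0 & 0 & C_3 & C_5 \\ 0 & 0 & C_5 & C_4 \end{bmatrix}
\begin{bmatrix} V(\hat\beta) & 0 & 0 & 0 \\ 0 & V(\hat\rho) & 0 & 0 \\ 0 & 0 & V(\hat\sigma_v^2) & \operatorname{Cov}(\hat\sigma_v^2,\hat\sigma_e^2) \\ 0 & 0 & \operatorname{Cov}(\hat\sigma_v^2,\hat\sigma_e^2) & V(\hat\sigma_e^2) \end{bmatrix}
\right\}
\end{aligned}$$

where

$$\begin{aligned}
C_1 &= \frac{\partial^2}{\partial\beta^2}\psi_{ij}(\eta) = \psi_{ij}(\eta)\,x_{ij}x_{ij}^T \\
C_2 &= \frac{\partial^2}{\partial\rho^2}\psi_{ij}(\eta) = \psi_{ij}(\eta)\left[A_1A_2A_1A_2 + A_3A_2 + A_1A_4\right] \\
C_3 &= \frac{\partial^2}{\partial(\sigma_v^2)^2}\psi_{ij}(\eta) = \psi_{ij}(\eta)A_5A_5 \\
C_4 &= \frac{\partial^2}{\partial(\sigma_e^2)^2}\psi_{ij}(\eta) = \tfrac14\psi_{ij}(\eta) \\
C_5 &= \frac{\partial^2}{\partial\sigma_v^2\,\partial\sigma_e^2}\psi_{ij}(\eta) = \tfrac12\psi_{ij}(\eta)A_5
\end{aligned}$$

Alternatively, $E\left[\psi_{ij}(\hat\eta)\right]$ can also be written as:

$$E\left[\psi_{ij}(\hat\eta)\right] \approx \psi_{ij}(\eta) + \tfrac12\operatorname{tr}\left\{\psi_{ij}(\eta)
\begin{bmatrix} D_1 & 0 & 0 & 0 \\ 0 & D_2 & 0 & 0 \\ 0 & 0 & D_3 & D_5 \\ 0 & 0 & D_5 & D_4 \end{bmatrix}
\begin{bmatrix} V(\hat\beta) & 0 & 0 & 0 \\ 0 & V(\hat\rho) & 0 & 0 \\ 0 & 0 & V(\hat\sigma_v^2) & \operatorname{Cov}(\hat\sigma_v^2,\hat\sigma_e^2) \\ 0 & 0 & \operatorname{Cov}(\hat\sigma_v^2,\hat\sigma_e^2) & V(\hat\sigma_e^2) \end{bmatrix}
\right\}$$

where

$$D_1 = x_{ij}x_{ij}^T,\quad D_2 = \left[A_1A_2A_1A_2 + A_3A_2 + A_1A_4\right],\quad D_3 = A_5A_5,\quad D_4 = \tfrac14,\quad D_5 = \tfrac12 A_5.$$

Then,

$$\begin{aligned}
E\left[\psi_{ij}(\hat\eta)\right] &\approx \psi_{ij}(\eta) + E\left[\tfrac12(\hat\eta-\eta)^T\psi_{ij}^{(2)}(\eta)(\hat\eta-\eta)\right] \\
&= \psi_{ij}(\eta) + \tfrac12\operatorname{tr}\left\{\psi_{ij}^{(2)}(\eta)\,E\left[(\hat\eta-\eta)(\hat\eta-\eta)^T\right]\right\} \\
&= \psi_{ij}(\eta) + \tfrac12\left\{\psi_{ij}(\eta)\left[D_1V(\hat\beta) + D_2V(\hat\rho) + D_3V(\hat\sigma_v^2) + D_4V(\hat\sigma_e^2)\right]\right\} \\
&= \psi_{ij}(\eta)\left\{1 + \tfrac12\left[D_1V(\hat\beta) + D_2V(\hat\rho) + D_3V(\hat\sigma_v^2) + D_4V(\hat\sigma_e^2)\right]\right\}
\end{aligned} \tag{A1.3}$$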
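The size of the multiplicative bias can be illustrated in the simplest scalar case. The sketch below is an illustration under assumptions that are not in the original: a single parameter with $\psi(\eta) = e^{\eta}$ (so the second-derivative factor is $D = 1$) and a normally distributed, unbiased $\hat\eta$ with variance $V$. Under those assumptions the exact ratio $E[e^{\hat\eta}]/e^{\eta}$ is $e^{V/2}$, which can be compared with the second-order factor $1 + V/2$.

```python
import math

def second_order_factor(variance):
    """Second-order Taylor factor 1 + V/2 (scalar analogue with D = 1)."""
    return 1.0 + 0.5 * variance

def exact_factor(variance):
    """Exact E[exp(eta_hat)] / exp(eta) when eta_hat ~ N(eta, V)."""
    return math.exp(0.5 * variance)

# Hypothetical estimator variances, from small to moderate.
for v in (0.01, 0.05, 0.2):
    print(v, second_order_factor(v), exact_factor(v))
```

For small variances the two factors are nearly identical; the second-order factor slightly understates the exact one as the variance grows.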

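Based on the result (A1.3), it can be seen that $\psi_{ij}(\hat\eta)$ is not an unbiased estimate of $\psi_{ij}(\eta)$. The bias correction factor for $\hat y_{ij}^{SEBP}$ is given by:

$$c_{ij}^{SEBP} = 1 + \tfrac12\left[D_1V(\hat\beta) + D_2V(\hat\rho) + D_3V(\hat\sigma_v^2) + D_4V(\hat\sigma_e^2)\right] \tag{A1.4}$$

As a result, the spatial empirical Bayes predictor (SEBP) of $\mu_i$ with bias correction is given by:

$$\hat\mu_i^{SEBP\_c} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i}\hat y_{ij}^{SEBP\_c}\right] \tag{A1.5}$$

where

$$\hat y_{ij}^{SEBP\_c} = \left(c_{ij}^{SEBP}\right)^{-1}\hat y_{ij}^{SEBP} = \left(c_{ij}^{SEBP}\right)^{-1}\exp\left\{x_{ij}^T\hat\beta + \tfrac12\left(z_{ij}^2\hat\tau_i^2 + \hat\sigma_e^2\right)\right\}.$$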
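As a rough illustration of how (A1.5) combines sampled and predicted values, the following sketch computes the bias-corrected area mean for one area. It assumes the estimates $\hat\beta$, $\hat\tau_i^2$, $\hat\sigma_e^2$ and the correction factors $c_{ij}^{SEBP}$ have already been obtained elsewhere; the function name `sebp_corrected_mean` and all input values are hypothetical.

```python
import numpy as np

def sebp_corrected_mean(y_sampled, X_r, z_r, beta_hat, tau2_hat, sigma_e2_hat, c_r):
    """Bias-corrected SEBP of the area mean for one area.

    y_sampled : observed y_ij for sampled units (j in s_i)
    X_r, z_r  : covariates x_ij and z_ij for non-sampled units (j in r_i)
    c_r       : correction factors c_ij for the non-sampled units
    """
    # Back-transformed prediction exp{x^T beta + (z^2 tau^2 + sigma_e^2) / 2}.
    y_hat = np.exp(X_r @ beta_hat + 0.5 * (z_r ** 2 * tau2_hat + sigma_e2_hat))
    # Divide by the correction factor to reduce the multiplicative bias.
    y_hat_c = y_hat / c_r
    N_i = len(y_sampled) + len(y_hat_c)
    return (np.sum(y_sampled) + np.sum(y_hat_c)) / N_i

# Hypothetical inputs: 2 sampled units and 2 non-sampled units.
y_sampled = np.array([2.0, 3.0])
X_r = np.array([[1.0, 0.0], [1.0, 1.0]])
z_r = np.array([1.0, 1.0])
beta_hat = np.array([0.0, 0.0])
mu_hat = sebp_corrected_mean(y_sampled, X_r, z_r, beta_hat, 0.0, 0.0, np.array([2.0, 2.0]))
print(mu_hat)
```

With all parameters set to zero the uncorrected predictions equal 1, so the effect of the correction factor on the area mean is easy to trace by hand.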

Appendix 2: The MSE of $\hat\mu_i^{SEBP\_c}$

The MSE of $\hat\mu_i^{SEBP\_c}$ is given by:

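$$\begin{aligned}
\operatorname{MSE}\left(\hat\mu_i^{SEBP\_c}\right) &= E\left(\hat\mu_i^{SEBP\_c} - \mu_i\right)^2 \\
&= E\left(\hat\mu_i^{SEBP\_c} - \hat\mu_i^{SBP\_c} + \hat\mu_i^{SBP\_c} - \mu_i\right)^2 \\
&= E\left[\left(\hat\mu_i^{SEBP\_c} - \hat\mu_i^{SBP\_c}\right) + \left(\hat\mu_i^{SBP\_c} - \mu_i\right)\right]^2 \\
&= E\left(\hat\mu_i^{SBP\_c} - \mu_i\right)^2 + E\left(\hat\mu_i^{SEBP\_c} - \hat\mu_i^{SBP\_c}\right)^2 + 2E\left(\hat\mu_i^{SEBP\_c} - \hat\mu_i^{SBP\_c}\right)\left(\hat\mu_i^{SBP\_c} - \mu_i\right)
\end{aligned} \tag{A2.1}$$

where

$$\hat\mu_i^{SEBP\_c} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i}\hat y_{ij}^{SEBP\_c}\right];\quad \hat y_{ij}^{SEBP\_c} = \left(c_{ij}^{SEBP}\right)^{-1}\exp\left\{x_{ij}^T\hat\beta + \tfrac12\left(z_{ij}^2\hat\tau_i^2 + \hat\sigma_e^2\right)\right\}$$

$$\hat\mu_i^{SBP\_c} = \frac{1}{N_i}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i}\hat y_{ij}^{SBP\_c}\right];\quad \hat y_{ij}^{SBP\_c} = \left(c_{ij}^{SBP}\right)^{-1}\exp\left\{x_{ij}^T\hat\beta + \tfrac12\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)\right\}$$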

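Note: $c_{ij}^{SEBP}$ has been derived in (A1.4), whereas $c_{ij}^{SBP}$ is similar to $c_{ij}^{SEBP}$ but $\eta = \left(\rho, \sigma_v^2, \sigma_e^2\right)$ is assumed to be known. The first term of (A2.1), $E\left(\hat\mu_i^{SBP\_c} - \mu_i\right)^2$, can be written as:

$$E\left(\hat\mu_i^{SBP\_c} - \mu_i\right)^2 = \operatorname{Var}\left(\hat\mu_i^{SBP\_c}\right) = \operatorname{Var}\left\{N_i^{-1}\left[\sum_{j\in s_i} y_{ij} + \sum_{j\in r_i}\hat y_{ij}^{SBP\_c}\right]\right\} \tag{A2.2}$$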

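Note: the first part, $\sum_{j\in s_i} y_{ij}$, which is calculated using sample data values, is uncorrelated with the second part, $\sum_{j\in r_i}\hat y_{ij}^{SBP\_c}$. Therefore:

$$\begin{aligned}
E\left(\hat\mu_i^{SBP\_c} - \mu_i\right)^2 &= \operatorname{Var}\left\{N_i^{-1}\sum_{j\in s_i} y_{ij}\right\} + \operatorname{Var}\left\{N_i^{-1}\sum_{j\in r_i}\hat y_{ij}^{SBP\_c}\right\} \\
&= \operatorname{Var}\left\{N_i^{-1}\sum_{j\in s_i} y_{ij}\right\} + N_i^{-2}\left\{\operatorname{Var}\sum_{j\in r_i}\hat y_{ij}^{SBP\_c}\right\}
\end{aligned} \tag{A2.3}$$

The first term of (A2.3), $\operatorname{Var}\left(N_i^{-1}\sum_{j\in s_i} y_{ij}\right)$, would be obtained using sample data values.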

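The moment generating function of $Y^* = \log Y$, which follows a normal distribution with mean $\mu$ and variance $\sigma^2$, is given by:

$$M(t) = E\left(e^{tY^*}\right) = e^{\mu t + \frac12 t^2\sigma^2}\quad\text{or}\quad M(t) = E\left(e^{tY^*}\right) = e^{t x_{ij}^T\beta + \frac12 t^2\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}$$

$$E\left(y_{ij}y_{ik}\right)^* = E\left[\left(x_{ij}^T\beta + z_{ij}u_i\right)\left(x_{ik}^T\beta + z_{ik}u_i\right)\right]$$

(note: it is the second moment of the lognormal distribution)

$$E\left(y_{ij}\right)^* = e^{x_{ij}^T\beta + \frac12\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)},\qquad E\left(y_{ik}\right)^* = e^{x_{ik}^T\beta + \frac12\left(z_{ik}^2\tau_i^2 + \sigma_e^2\right)}$$

$$\begin{aligned}
\operatorname{cov}\left(y_{ij}, y_{ik}\right)^* &= E\left(y_{ij}y_{ik}\right)^* - E\left(y_{ij}\right)^* E\left(y_{ik}\right)^* \\
&= E\left[\left(x_{ij}^T\beta + z_{ij}u_i\right)\left(x_{ik}^T\beta + z_{ik}u_i\right)\right] - e^{x_{ij}^T\beta + \frac12\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}\,e^{x_{ik}^T\beta + \frac12\left(z_{ik}^2\tau_i^2 + \sigma_e^2\right)}
\end{aligned}$$

$$E\left(y_{ij}^{2}\right)^* = e^{2x_{ij}^T\beta + 2\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}$$

$$\operatorname{var}\left(y_{ij}\right)^* = E\left(y_{ij}^{2}\right)^* - \left[E\left(y_{ij}\right)^*\right]^2 = e^{2x_{ij}^T\beta + 2\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)} - e^{2x_{ij}^T\beta + \left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}$$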
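The lognormal mean, second moment, and variance above follow from evaluating the normal moment generating function at $t = 1$ and $t = 2$. The sketch below is only an illustration with hypothetical values; `m` stands for $x_{ij}^T\beta$ and `s2` for $z_{ij}^2\tau_i^2 + \sigma_e^2$. It checks that the difference form of the variance matches the factored form $e^{2m + s^2}\left(e^{s^2} - 1\right)$ that appears in the variance sums later in this appendix.

```python
import math

def lognormal_mean(m, s2):
    """E[Y] = M(1) = exp(m + s2 / 2) when log Y ~ N(m, s2)."""
    return math.exp(m + 0.5 * s2)

def lognormal_second_moment(m, s2):
    """E[Y^2] = M(2) = exp(2 m + 2 s2)."""
    return math.exp(2.0 * m + 2.0 * s2)

def lognormal_variance(m, s2):
    """Var[Y] = E[Y^2] - (E[Y])^2."""
    return lognormal_second_moment(m, s2) - lognormal_mean(m, s2) ** 2

# Hypothetical values for the linear predictor and the log-scale variance.
m, s2 = 0.3, 0.5
factored = math.exp(2.0 * m + s2) * (math.exp(s2) - 1.0)
print(lognormal_variance(m, s2), factored)
```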

The second term of (A2.3) is given by:

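$$\begin{aligned}
N_i^{-2}\left\{\operatorname{Var}\sum_{j\in r_i}\hat y_{ij}^{SBP\_c}\right\} &= N_i^{-2}\left\{\sum_{j\in r_i}\operatorname{Var}\left(\hat y_{ij}^{SBP\_c}\right) + 2\mathop{\sum\sum}_{j<k\in r_i}\operatorname{cov}\left(\hat y_{ij}^{SBP\_c}, \hat y_{ik}^{SBP\_c}\right)\right\} \\
&= N_i^{-2}\left\{\sum_{j\in r_i}\operatorname{Var}\left[\left(c_{ij}^{SBP}\right)^{-1}\hat y_{ij}^{SBP}\right] + 2\mathop{\sum\sum}_{j<k\in r_i}\operatorname{cov}\left[\left(c_{ij}^{SBP}\right)^{-1}\hat y_{ij}^{SBP}, \left(c_{ij}^{SBP}\right)^{-1}\hat y_{ik}^{SBP}\right]\right\} \\
&= N_i^{-2}\left\{\left(c_{ij}^{SBP}\right)^{-2}\sum_{j=1}^{N_i-n_i} e^{2x_{ij}^T\hat\beta + \left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}\left[e^{\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)} - 1\right] + 2\left(c_{ij}^{SBP}\right)^{-2}\sum_{1\le j<k\le(N_i-n_i)}\operatorname{cov}\left(\hat y_{ij}^{SBP}, \hat y_{ik}^{SBP}\right)\right\} \\
&= N_i^{-2}\left\{\left(c_{ij}^{SBP}\right)^{-2}\sum_{j=1}^{N_i-n_i} e^{2x_{ij}^T\hat\beta + \left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}\left[e^{\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)} - 1\right]\right. \\
&\qquad\left. + 2\left(c_{ij}^{SBP}\right)^{-2}\sum_{1\le j<k\le(N_i-n_i)}\left[E\left[\left(x_{ij}^T\hat\beta + z_{ij}u_i\right)\left(x_{ik}^T\hat\beta + z_{ik}u_i\right)\right] - e^{x_{ij}^T\hat\beta + \frac12\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}\,e^{x_{ik}^T\hat\beta + \frac12\left(z_{ik}^2\tau_i^2 + \sigma_e^2\right)}\right]\right\}
\end{aligned}$$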

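Note that $\hat\mu_i^{SBP\_c}$ depends on $\sigma_v^2$, $\sigma_e^2$ and $\rho$, which are usually unknown. The $\hat\mu_i^{SEBP\_c}$ is obtained from $\hat\mu_i^{SBP\_c}$ by replacing $\sigma_v^2$, $\sigma_e^2$ and $\rho$ by their estimates. The cross-product term in (A2.1), $E\left(\hat\mu_i^{SEBP\_c} - \hat\mu_i^{SBP\_c}\right)\left(\hat\mu_i^{SBP\_c} - \mu_i\right)$, is zero. See Rao and Molina (2015) for the proof that $E\left(\hat\mu_i^{SEBP\_c} - \hat\mu_i^{SBP\_c}\right)\left(\hat\mu_i^{SBP\_c} - \mu_i\right) = 0$. Using a Taylor approximation, $\hat\mu_i^{SEBP\_c}$ can be written as:

$$\hat\mu_i^{SEBP\_c} - \hat\mu_i^{SBP\_c} \approx (\hat\eta - \eta)\frac{\partial}{\partial\eta}\left(\hat\mu_i^{SBP\_c}\right);\quad \eta = \left(\beta, \rho, \sigma_v^2, \sigma_e^2\right)\ \text{and}\ \hat\eta = \left(\hat\beta, \hat\rho, \hat\sigma_v^2, \hat\sigma_e^2\right) \tag{A2.4}$$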

Squaring both sides of (A2.4) and then taking the expectation of both sides, we obtain:

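$$\begin{aligned}
E\left(\hat\mu_i^{SEBP\_c} - \hat\mu_i^{SBP\_c}\right)^2 &\approx E\left[(\hat\eta-\eta)\frac{\partial}{\partial\eta}\left(\hat\mu_i^{SBP\_c}\right)\right]^2 \\
&= E\left\{(\hat\eta-\eta)\frac{\partial}{\partial\eta}\left(N_i^{-1}\sum_{j\in s_i} y_{ij} + N_i^{-1}\sum_{j\in r_i}\hat y_{ij}^{SBP\_c}\right)\right\}^2 \\
&= E\left\{(\hat\eta-\eta)\left\{\frac{\partial}{\partial\eta}\left(N_i^{-1}\sum_{j\in r_i}\hat y_{ij}^{SBP\_c}\right)\right\}\right\}^2,\quad\text{Note: }\frac{\partial}{\partial\eta}\left(N_i^{-1}\sum_{j\in s_i} y_{ij}\right) = 0 \\
&= E\left\{\left[(\hat\eta-\eta)\left\{N_i^{-1}\left(\sum_{j\in r_i}\frac{\partial}{\partial\eta}\hat y_{ij}^{SBP\_c}\right)\right\}\right]\left[(\hat\eta-\eta)\left\{N_i^{-1}\left(\sum_{j\in r_i}\frac{\partial}{\partial\eta}\hat y_{ij}^{SBP\_c}\right)\right\}\right]\right\} \\
&= E\left\{(\hat\eta-\eta)(\hat\eta-\eta)^T\left\{N_i^{-1}\left(\sum_{j\in r_i}\frac{\partial}{\partial\eta}\hat y_{ij}^{SBP\_c}\right)\right\}\left\{N_i^{-1}\left(\sum_{j\in r_i}\frac{\partial}{\partial\eta}\hat y_{ij}^{SBP\_c}\right)\right\}^T\right\} \\
&= \operatorname{tr}\left\{
\begin{bmatrix} V(\hat\rho) & 0 & 0 \\ 0 & V(\hat\sigma_v^2) & \operatorname{Cov}(\hat\sigma_v^2,\hat\sigma_e^2) \\ 0 & \operatorname{Cov}(\hat\sigma_v^2,\hat\sigma_e^2) & V(\hat\sigma_e^2) \end{bmatrix}
\begin{bmatrix} N_i^{-1}A \\ N_i^{-1}B \\ N_i^{-1}C \end{bmatrix}
\begin{bmatrix} N_i^{-1}A \\ N_i^{-1}B \\ N_i^{-1}C \end{bmatrix}^T
\right\}
\end{aligned}$$

Note:

$$\frac{\partial}{\partial\eta}\hat y_{ij}^{SBP\_c} = \begin{bmatrix} \dfrac{\partial}{\partial\rho}\hat y_{ij}^{SBP\_c} \\[4pt] \dfrac{\partial}{\partial\sigma_v^2}\hat y_{ij}^{SBP\_c} \\[4pt] \dfrac{\partial}{\partial\sigma_e^2}\hat y_{ij}^{SBP\_c} \end{bmatrix} = \begin{bmatrix} A \\ B \\ C \end{bmatrix}$$

$$\begin{aligned}
E\left(\hat\mu_i^{SEBP\_c} - \hat\mu_i^{SBP\_c}\right)^2 &\approx N_i^{-2}\operatorname{tr}\left\{
\begin{bmatrix} V(\hat\rho) & 0 & 0 \\ 0 & V(\hat\sigma_v^2) & \operatorname{Cov}(\hat\sigma_v^2,\hat\sigma_e^2) \\ 0 & \operatorname{Cov}(\hat\sigma_v^2,\hat\sigma_e^2) & V(\hat\sigma_e^2) \end{bmatrix}
\begin{bmatrix} A^2 & AB & AC \\ AB & B^2 & BC \\ AC & BC & C^2 \end{bmatrix}
\right\} \\
&\approx N_i^{-2}\left[A^2V(\hat\rho) + B^2V(\hat\sigma_v^2) + 2BC\operatorname{Cov}(\hat\sigma_v^2,\hat\sigma_e^2) + C^2V(\hat\sigma_e^2)\right] \\
&\approx N_i^{-2}\left[A^2V(\hat\rho) + \nabla\left(B^2\hat\sigma_v^2 + C\hat\sigma_e^2\right)\right]
\end{aligned}$$

$$A = \frac{\partial}{\partial\rho}\sum_{j\in r_i}\hat y_{ij}^{SBP\_c} = \sum_{j\in r_i}\frac{\partial}{\partial\rho}\hat y_{ij}^{SBP\_c} = \sum_{j\in r_i}\left\{\hat y_{ij}^{SBP\_c}\,\tfrac12 z_{ij}^2\left[b_i^T\sigma_v^2\left[(I-\rho W^T)(I-\rho W)\right]^{-1}\left(W^T + W - 2\rho W^TW\right)\left[(I-\rho W^T)(I-\rho W)\right]^{-1}b_i\right]\right\}$$

$$B = \frac{\partial}{\partial\sigma_v^2}\sum_{j\in r_i}\hat y_{ij}^{SBP\_c} = \sum_{j\in r_i}\frac{\partial}{\partial\sigma_v^2}\hat y_{ij}^{SBP\_c} = \sum_{j\in r_i}\left\{\hat y_{ij}^{SBP\_c}\,\tfrac12 z_{ij}^2\left[b_i^T\left[(I-\rho W^T)(I-\rho W)\right]^{-1}b_i\right]\right\}$$

$$C = \frac{\partial}{\partial\sigma_e^2}\sum_{j\in r_i}\hat y_{ij}^{SBP\_c} = \sum_{j\in r_i}\frac{\partial}{\partial\sigma_e^2}\hat y_{ij}^{SBP\_c} = \sum_{j\in r_i}\tfrac12\hat y_{ij}^{SBP\_c}$$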

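Finally, the MSE of $\hat\mu_i^{SEBP\_c}$ is:

$$\begin{aligned}
\operatorname{MSE}\left(\hat\mu_i^{SEBP\_c}\right) &\approx \operatorname{Var}\left\{N_i^{-1}\sum_{j\in s_i} y_{ij}\right\} \\
&\quad + N_i^{-2}\left\{\sum_{j=1}^{N_i-n_i} e^{2x_{ij}^T\beta + \left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}\left[e^{\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)} - 1\right]\right. \\
&\qquad\left. + 2\sum_{1\le j<k\le(N_i-n_i)}\left[E\left[\left(x_{ij}^T\beta + z_{ij}u_i\right)\left(x_{ik}^T\beta + z_{ik}u_i\right)\right] - e^{x_{ij}^T\beta + \frac12\left(z_{ij}^2\tau_i^2 + \sigma_e^2\right)}\,e^{x_{ik}^T\beta + \frac12\left(z_{ik}^2\tau_i^2 + \sigma_e^2\right)}\right]\right\} \\
&\quad + N_i^{-2}\left[A^2V(\hat\rho) + \nabla\left(B^2\hat\sigma_v^2 + C\hat\sigma_e^2\right)\right];
\end{aligned}$$

where $\nabla(\cdot)$ is the asymptotic variance–covariance matrix of the total variability of the response.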
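The parameter-estimation term of the MSE is a trace of the form $\operatorname{tr}\{V gg^T\}$ with $g = (A, B, C)^T$, which equals the quadratic form $g^TVg$. The sketch below illustrates that identity and its expansion into $A^2V(\hat\rho) + B^2V(\hat\sigma_v^2) + 2BC\operatorname{Cov}(\hat\sigma_v^2,\hat\sigma_e^2) + C^2V(\hat\sigma_e^2)$. All numeric values and the function name `estimation_term` are hypothetical; in practice $A$, $B$, $C$ and the variance matrix would come from the fitted model.

```python
import numpy as np

def estimation_term(A, B, C, V, N_i):
    """N_i^{-2} * tr{V g g^T} with g = (A, B, C)^T, computed as g^T V g."""
    g = np.array([A, B, C], dtype=float)
    return float(g @ V @ g) / N_i ** 2

# Hypothetical derivative sums and estimator (co)variances.
A, B, C, N_i = 1.2, 0.7, 2.5, 40
V_rho, V_v, V_e, cov_ve = 0.010, 0.004, 0.002, -0.001
V = np.array([
    [V_rho, 0.0,    0.0],
    [0.0,   V_v,    cov_ve],
    [0.0,   cov_ve, V_e],
])

via_quadratic_form = estimation_term(A, B, C, V, N_i)
via_trace = float(np.trace(V @ np.outer([A, B, C], [A, B, C]))) / N_i ** 2
expanded = (A**2 * V_rho + B**2 * V_v + 2 * B * C * cov_ve + C**2 * V_e) / N_i ** 2
print(via_quadratic_form, via_trace, expanded)
```

Using the quadratic form avoids building the $3\times3$ outer product explicitly, although for a matrix this small the difference is only cosmetic.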

References

Asfar, Kurnia A, Sadik K (2016) Optimum spatial weighted in small area estimation. Glob J Pure Appl Math 12(5):3977–3989

Bellow ME, Lahiri PS (2011) An empirical best linear unbiased prediction approach to small area estimation of crop parameters. In Section on survey research methods, pp 3976–3986

Berg E, Chandra H (2014) Small area prediction for a unit-level lognormal. Comput Stat Data Anal 78:159–175

Chandra H, Chambers R (2011) Small area estimation under transformation to linearity. Surv Methodol 37:39–51

Jiang J (1996) REML estimation: asymptotic behaviour and related topics. Ann Stat 24:256–286

Karlberg F (2000) Population total prediction under a lognormal superpopulation model. Metron, LVIII, pp 53–80

Kurnia A, Chambers R (2011) Small area inference for positively skewed distributions. In: The proceeding of the 6-th SEAMS-GMU international conference on mathematics and its applications, July 12–15, 2011, Yogyakarta

McCulloch CE, Searle SR (2001) Generalized, linear and mixed models. Wiley, New York

Molina I, Salvati N, Pratesi M (2009) Bootstrap for estimating the MSE of the spatial EBLUP. Comput Stat 24:441–458

Petrucci A, Salvati N (2004a) Small area estimation using spatial information, The Rathbun lake watershed case study. Working Paper no 2004/02, “G. Parenti” Department of Statistics, University of Florence

Petrucci A, Salvati N (2004b) Small area estimation considering spatially correlated errors: the unit level random effects model. Working Paper no 2004/10, “G. Parenti” Department of Statistics, University of Florence

Petrucci A, Salvati N (2006) Small area estimation for spatial correlation in watershed erosion assessment. J Agric Biol Environ Stat 11:169–182

Prasad NGN, Rao JNK (1990) The estimation of mean squared errors of small area estimators. J Am Stat Assoc 85:163–171

Pratesi M, Salvati N (2008) Small area estimation: the EBLUP estimator based on spatially correlated random effects. Stat Methods Appl 17:113–141

Rao JNK, Molina I (2015) Small area estimation. Wiley, New York

Salvati N (2004) Small area estimation by spatial models: the spatial empirical best linear unbiased prediction (spatial EBLUP). Working Paper no 2004/03, “G. Parenti” Department of Statistics, University of Florence

Slud EV, Maiti T (2006) Mean-squared estimation in transformed Fay–Herriot models. J R Stat Soc B 68:67–72

Wang J, Fuller WA (2003) The mean squared error of small area predictor constructed with estimated area variances. J Am Stat Assoc 98:716–745
