Investigating spatial non-stationary environmental effects on the distribution of giant pandas in the Qinling Mountains, China


Original Research Article


Xinping Ye a,b, Xiaoping Yu a,b, Tiejun Wang c,*

a College of Life Sciences, Shaanxi Normal University, Xi’an, 710119, China

b Research Center for UAV Remote Sensing, Shaanxi Normal University, Xi’an, 710119, China

c Department of Natural Resources, Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, P.O. Box 217, 7500 AE, Enschede, the Netherlands

Article info

Article history:

Received 24 August 2019

Received in revised form 20 December 2019

Accepted 20 December 2019

Keywords:

Spatial heterogeneity

Species-environment relationship

Geographically weighted logistic regression

Global logistic regression

Giant panda

Abstract

Analyses of species distribution have commonly been performed with global regression models by assuming species-environment interactions are spatially stationary. However, environmental variables are often spatially heterogeneous and their effects on species distribution may vary across space. Here we employed a geographically weighted logistic regression (logistic GWR) to investigate environmental effects on the distribution of giant pandas (Ailuropoda melanoleuca) in the Qinling Mountains of China. Outputs from the logistic GWR were compared with those derived from a global logistic regression in predicting panda distribution. A k-means cluster analysis was used to identify distinct zones of panda-environment relationships. We found that logistic GWR outperformed global logistic regression in terms of goodness-of-fit and predictive accuracy. Results from the logistic GWR model clearly showed that both the strength and direction of the environmental effects on panda distribution changed spatially and formed distinct subareas with particular panda-environment relationships. The findings emphasize the importance of considering spatial non-stationarity in studying ecological relationships between organisms and their environments, especially for threatened species such as the giant panda with small populations in highly fragmented habitats.

* Corresponding author.

E-mail address: t.wang@utwente.nl (T. Wang).

Global Ecology and Conservation

Journal homepage: http://www.elsevier.com/locate/gecco

https://doi.org/10.1016/j.gecco.2019.e00894

1. Introduction

Understanding species-environment relationships in heterogeneous landscapes is essential for biodiversity conservation and landscape management (Wu et al., 2000; Turner and Tjørve, 2005). Conventional regression models, such as multivariate linear regression (Radeloff et al., 1999), ordinary least-squares regression (Coppolillo, 2000), and logistic regression (Pereira and Itami, 1991; Augustin et al., 1996), are the most common statistical methods used for estimating the ecological relationships between organisms and environmental covariates (Windle et al., 2009; Tavernia and Reed, 2012). These models postulate that the relationships between species’ measures and environmental covariates are spatially stationary, and produce constant regression coefficients throughout the studied area (Fotheringham et al., 1996, 2002; Mcnew et al., 2013). However, ecological processes in a real landscape are likely to be spatially heterogeneous, so that spatial non-stationarity emerges (i.e., the regression coefficients vary across areas), which makes it difficult to accurately identify the key environmental factors affecting the ecological processes of interest (Mitchell et al., 2001; Miller and Hanham, 2011). In such circumstances, models that assume spatially constant relationships (i.e., global regression models) may misinterpret the true factors contributing to the underlying spatial patterns of the distribution of organisms and, consequently, lead to biased inference and ineffective management (Mitchell et al., 2001; Fotheringham et al., 2002; Mcnew et al., 2013). Therefore, it is critical to investigate whether the effects of environmental factors are spatially heterogeneous before making model inferences over wide areas (Fortin et al., 2006).

To improve the power of models in assessing ecological relationships, a few techniques have been developed to account for unexplained spatial dependence in data by adding spatially correlated random effects to standard models, such as the spatial autoregressive model (Lichstein et al., 2002; Miller, 2005), the spatial spline model (Sangalli et al., 2013), and the spatial Gaussian fields model (Lindgren et al., 2011; also see Bakka et al., 2019; Martínez-Minaya et al., 2019). These approaches, although promising for capturing spatial patterns in the covariates’ effects, can be strongly influenced by departures from the underlying distributions, while the estimated regression coefficients are still spatially invariant (i.e., ‘global’ coefficients) and may overlook potentially influential variables affecting ecological processes of interest (Loucks and Wang, 2004).

More recently, a spatial statistical method termed geographically weighted regression (GWR) has been introduced as a powerful method to address spatial heterogeneity in ecological relationships (Brunsdon et al., 1996). GWR extends traditional regression models by allowing each covariate’s parameters to vary among different locations (Brunsdon et al., 1996; Fotheringham et al., 2002). Unlike non-spatial regression models that compute only one set of global coefficients, GWR calculates local coefficients for each observation by weighting neighboring observations with a decreasing function of distance (Zhang and Shi, 2004). Such location-specific coefficients can explicitly show the pattern of spatially varying species-environment relationships over the areas of interest. GWR has been successfully applied in different studies, such as evaluating spatially explicit vegetation-environment relationships (Bickford and Laffan, 2006; Kupfer and Farris, 2007), detecting spatial heterogeneity in avian nest-site selection (Mcnew et al., 2013), investigating spatial determinants of infectious disease (Hu et al., 2012), and detecting links between human leptospirosis and hydrological dynamics (Vega-Corredor and Opadeyi, 2014). In this context, GWR could also be a promising tool for understanding the spatial nature of wildlife distribution and how it varies spatio-temporally in heterogeneous landscapes (Shi et al., 2006).

The giant panda (Ailuropoda melanoleuca) is a threatened and iconic conservation species, and panda-habitat relationships have been the focal interest of ecological research on giant pandas (Wei et al., 2015). Previous studies have applied a variety of statistical methods (e.g., generalized linear regression (Wang et al., 2010; Kang et al., 2014), discriminant function analysis (Ye et al., 2007), and ecological niche factor analysis (Qi et al., 2012)) to estimate the environmental effects on giant pandas and found that the spatial distribution of giant pandas is regulated by topographic conditions, vegetation fragmentation, and human disturbances (Hull et al., 2014). Although technical details differ among these modeling approaches, they are aspatial and seldom consider spatial non-stationarity in analyzing panda-habitat relationships (Yackulic and Ginsberg, 2016). If the underlying scientific assumptions about the ecological needs of giant pandas are incorrect, the resulting habitat models would misguide panda conservation and management (Swaisgood et al., 2010; Zhang et al., 2011). Considering the inability of global regression models to capture spatial variations in ecological relationships, there is an urgent need to reassess the conventional wisdom about the panda’s habitat requirements. From this perspective, GWR could be an ideal analytical means to improve our understanding of giant pandas’ ecological needs across heterogeneous space. However, to our knowledge, no study has employed the GWR technique to investigate spatial non-stationarity in associations between environmental factors and giant panda distribution and habitat use.

In this study, we analyzed the relationships between panda distribution and environmental factors in the Qinling Mountains of central China, using an extension of the GWR technique called geographically weighted logistic regression (hereafter logistic GWR). Specifically, we set out to (1) compare the performance of logistic GWR with that of the commonly used global logistic regression in modeling the distribution of giant pandas; (2) examine spatial variability in the effects of environmental factors on panda distribution; and (3) identify the distinct zones of panda-environment relationships over the studied area.

2. Material and methods

2.1. Study area

The study area (32°50′–34°10′N, 106°25′–108°50′E) extends 400–500 km in the east-west direction through the southern part of Shaanxi Province and covers the whole distribution of giant pandas in the Qinling Mountains, China (Fig. 1). Acting as a boundary between northern China and southern China, the broad, undulating southern slopes capture the warm rains and moisture from the southeastern monsoon and are dominated by evergreen broad-leaved and mixed-deciduous broadleaved forests. According to the fourth national giant panda survey, around 340 giant pandas inhabit the forests in this mountainous region, and most of them are concentrated on the southern slope of the Qinling Mountains, which lie at the northern edge of the giant panda distribution (Shaanxi Provincial Forestry Department, 2017). The forests, as well as the panda habitats, are geographically isolated from the other giant panda mountain ranges in China (State Forestry Administration of China, 2006). Human density and road density vary considerably across the study area, and local residents mostly live in low-altitude zones.

2.2. Panda distribution data

Giant panda occurrence points (n = 294) were derived from the database of the Third National Giant Panda Survey carried out in 1999–2001 (State Forestry Administration of China, 2006). The survey was conducted via an exhaustive dragnet investigation throughout the study area (Loucks and Wang, 2004). Due to the lack of true absence data, we generated panda pseudo-absences by extracting 500 random points within forest areas but outside 3-km buffer zones around giant panda occurrence locations (Wang et al., 2010). To reduce spatial autocorrelation in the distribution data, both presence and pseudo-absence locations were spatially filtered at a distance of 2 km (Wang et al., 2010) with the SDMtoolbox toolkit (Brown, 2014) in ArcGIS software (ESRI, v. 10.2). The resulting data, containing 135 panda presence and 135 pseudo-absence locations, were then used as the binary response variable for further modeling of panda distribution.
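The distance-based filtering described above can be illustrated with a minimal sketch. It assumes projected coordinates in metres and a simple greedy rule (keep a point only if it is at least the minimum distance from every point already kept); the function name `thin_points` and the coordinates are hypothetical, and this is not the SDMtoolbox implementation, whose exact algorithm may differ.

```python
import numpy as np

def thin_points(coords, min_dist):
    """Greedy spatial thinning: keep a point only if it lies at least
    min_dist from every point already kept (the result depends on order)."""
    coords = np.asarray(coords, dtype=float)
    kept = []
    for i, p in enumerate(coords):
        if all(np.hypot(*(p - coords[j])) >= min_dist for j in kept):
            kept.append(i)
    return kept

# Hypothetical projected coordinates in metres (not the survey data).
pts = [(0, 0), (500, 0), (2500, 0), (2600, 100), (5000, 5000)]
kept_idx = thin_points(pts, min_dist=2000)  # 2-km filter as in the text
print(kept_idx)
```

Because the greedy rule is order-dependent, shuffling the input can change which points survive; a production workflow would typically repeat the thinning or use a dedicated tool.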

2.3. Environmental variables

We first developed a preliminary set of potential predictor variables (see Table A1) that have been commonly used in previous panda habitat studies in the study area (e.g., Feng et al., 2009; Wang et al., 2010; Hull et al., 2014). To reduce the multicollinearity of explanatory variables (see Fig. A1), we statistically thinned the preliminary set of environmental variables with the ‘corSelect’ function in the ‘fuzzySim’ R package (Barbosa, 2015). If the correlation coefficient of a pair of variables was >0.75 or the VIF values were >10 (Zuur et al., 2010), the variables were tested in a bivariate model and the one with a better fit was retained. The final set of variables included topographic roughness, climate heterogeneity, mean forest patch size, distance between forest patches, distance to road, and distance to settlement (Table 1; also see Fig. A2). Topographic roughness and climate heterogeneity reflect geophysical and climatic conditions, mean forest patch size and distance between forest patches depict the spatial patterns of forests, and distances to road and settlement measure the severity of human disturbances in the study area (Fig. A2).
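As an illustration of the screening thresholds mentioned above (pairwise correlation >0.75, VIF >10), the following sketch flags collinear variables with plain NumPy. It is not the ‘corSelect’ routine: the function names `vif` and `flag_collinear` and the synthetic data are assumptions for illustration, and the sketch only flags candidates without performing the bivariate model comparison used to decide which variable to retain.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_vars),
    computed as 1 / (1 - R^2) from regressing that column on the others."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid.var() / y.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

def flag_collinear(X, r_max=0.75, vif_max=10.0):
    """Return (variable pairs with |r| > r_max, indices with VIF > vif_max)."""
    r = np.corrcoef(X, rowvar=False)
    p = r.shape[0]
    pairs = [(i, j) for i in range(p) for j in range(i + 1, p)
             if abs(r[i, j]) > r_max]
    high_vif = [j for j, v in enumerate(vif(X)) if v > vif_max]
    return pairs, high_vif

# Synthetic predictors: column 2 is nearly a copy of column 0.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.05 * rng.normal(size=200)
pairs, high_vif = flag_collinear(np.column_stack([a, b, c]))
print(pairs, high_vif)
```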

2.4. Model development

We applied both global logistic regression and logistic GWR models to evaluate the spatial distribution of giant pandas. The global logistic regression model (i.e., binomial GLM with a logistic link) is a non-spatial method that ignores spatial variability and produces a single, spatially stationary coefficient for each environmental variable (Real et al., 2006). In contrast, the logistic GWR model is a local form of logistic regression and produces location-specific coefficients for each variable, which may vary across the studied space (Brunsdon et al., 1996; Fotheringham et al., 2002). A typical logistic GWR model takes the following form:

Fig. 1. Map of the study area in the Qinling Mountains of central China. Presence (filled triangles) and pseudo-absence (plus signs) samples of giant pandas are extracted from the Third National Giant Panda Survey conducted in 1999–2001.


$$\log\left[\frac{y_i^{*}}{1 - y_i^{*}}\right] = \beta_{0i} + \sum_{k} \beta_{ki} x_{ki} + \varepsilon_i \qquad (1)$$

where $y_i^{*}$ is the prediction of the dependent variable $y$ at observation $i$, $\beta_{0i}$ is the intercept specific to observation $i$, $\beta_{ki}$ is the coefficient for the $k$th covariate at observation $i$, and $x_{ki}$ and $\varepsilon_i$ are the $k$th covariate and error at observation $i$, respectively. In a logistic GWR model, local variable coefficients for each observation are fitted via a distance-decay kernel weighting scheme in which neighbors closer to the modeled observation are weighted more heavily than those further away (Fotheringham et al., 2002). The kernel bandwidth of the weighting function can be set to ‘fixed’ or ‘adaptive’. A fixed kernel means the bandwidth size stays constant across space, while an adaptive kernel means the bandwidth size adjusts based on the number of neighbors around the focal observation (Fotheringham et al., 2002; Atkinson et al., 2003). In addition, the bandwidth size of the spatial kernel can affect model fitting by controlling the variance in the weighting function. Narrow bandwidths result in highly localized and varied coefficient estimates, whereas broad bandwidths lead to local coefficients that are similar to global regression coefficients (Mcnew et al., 2013).
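To make the local fitting in Eq. (1) concrete, the following is a minimal sketch of a logistic GWR with an adaptive Gaussian kernel. It rests on several assumptions that are not taken from the study: the Gaussian weight is written as exp(−0.5 (d/b)²), the adaptive bandwidth b is taken as the distance to the n-th nearest observation, the local models are fitted by Newton-Raphson, and all function names and data are hypothetical. It is not the ‘gwrr’ or ‘GWmodel’ implementation.

```python
import numpy as np

def gaussian_adaptive_weights(coords, focal, n_neighbors):
    """Gaussian kernel weights; the bandwidth is the distance from the
    focal point to its n_neighbors-th nearest observation (adaptive)."""
    d = np.linalg.norm(coords - focal, axis=1)
    bandwidth = np.sort(d)[n_neighbors - 1]
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def weighted_logistic_fit(X, y, w, n_iter=25, ridge=1e-8):
    """Weighted logistic regression by Newton-Raphson.
    X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)      # guard against overflow
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        hess += ridge * np.eye(X.shape[1])    # small ridge for stability
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def logistic_gwr(coords, X, y, n_neighbors):
    """Return one local coefficient vector (intercept first) per observation."""
    Xd = np.column_stack([np.ones(len(y)), X])
    return np.array([
        weighted_logistic_fit(
            Xd, y, gaussian_adaptive_weights(coords, c, n_neighbors))
        for c in coords
    ])

# Synthetic data: the effect of x is negative in the west, positive in the east.
rng = np.random.default_rng(1)
n = 400
coords = rng.uniform(0, 100, size=(n, 2))
x = rng.normal(size=n)
slope = np.where(coords[:, 0] < 50, -2.0, 2.0)
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-slope * x))).astype(float)

local_beta = logistic_gwr(coords, x[:, None], y, n_neighbors=60)
west = local_beta[coords[:, 0] < 25, 1].mean()
east = local_beta[coords[:, 0] > 75, 1].mean()
print(west, east)
```

A single global logistic regression on these synthetic data would estimate a slope near zero, whereas the local fits recover the change of sign, which is the kind of non-stationarity the method is designed to expose.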

The same set of panda presence-absence data and environmental variables was used for global logistic regression and logistic GWR modeling. All environmental variables were standardized via z-transformation (Zuur et al., 2010) to facilitate direct comparison of their effects on giant panda distribution. Because the panda presence-absence samples were irregularly spaced in the study area, we adopted a Gaussian spatial weighting function with an adaptive kernel in the logistic GWR, where the optimum bandwidth size was determined by minimizing the corrected Akaike Information Criterion (AICc), as suggested by Fotheringham et al. (2002).

2.5. Model evaluation and comparison

The performances of the logistic GWR and global logistic regression models were examined based on AICc and adjusted R² (Miller, 2012). In general, two models are considered to differ in goodness of fit if the difference between their AICc values is greater than four (Saefuddin et al., 2012). We also compared the ability of each model to predict panda presence/absence based on the area under the receiver operating characteristic curve (AUC; Zou et al., 2007). Spatial autocorrelation of regression residuals was examined with the Moran’s I statistic (Moran, 1950). Furthermore, we calculated and compared each model’s classification accuracy (i.e., the percentage of correct predictions), the degree of model agreement (measured by Cohen’s Kappa coefficient), and the true skill statistic (TSS). All statistical analyses were conducted using the ‘gwrr’ package (Wheeler, 2007) and the ‘GWmodel’ package (Lu et al., 2013) in the R environment (R Core Team, 2018).
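The classification measures named above follow standard definitions and can be sketched as below. The function names, the 0.5 threshold, and the toy labels and scores are assumptions for illustration; the study does not state the threshold it used.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, Cohen's kappa, and TSS from binary (0/1) labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    tss = sensitivity + specificity - 1
    return {"accuracy": accuracy, "kappa": kappa, "tss": tss}

def auc(y_true, score):
    """AUC as the probability that a random presence outscores a random
    absence (ties count one half)."""
    y_true = np.asarray(y_true)
    score = np.asarray(score, dtype=float)
    pos, neg = score[y_true == 1], score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# Toy example (not the study data), thresholded at 0.5.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in score]
m = classification_metrics(y_true, y_pred)
a = auc(y_true, score)
print(m, a)
```

With equal numbers of presences and absences, as in the 135/135 sample used here, TSS equals 2 × accuracy − 1, and kappa takes the same value whenever the predicted classes are also balanced. This is consistent with the values reported in Table 4 (for example, 2 × 0.897 − 1 = 0.794).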

2.6. Mapping spatial variability in panda-environment relationships

To explicitly show how environmental effects on giant pandas varied over space, we converted a set of local parameter estimates from logistic GWR, including local coefficients, local R² values, and local residuals, to continuous surfaces via Inverse Distance Weighting (IDW) interpolation (Windle et al., 2009). We further conducted k-means clustering on logistic GWR-derived local coefficients to detect specific zones of distinct panda-environment relationships (Windle et al., 2009). The number of clusters (k) was set a priori from 2 to 5, and the best number of clusters was estimated based on a gap statistic (Tibshirani et al., 2001; Windle et al., 2009; Liu et al., 2019). Mean regression coefficients were calculated for each cluster. All continuous surfaces of local parameters as well as the spatial patterns of the clusters were mapped in ArcGIS software (ESRI, v10.2).

Table 1

Description of environmental variables used in the modeling of giant panda distribution.

Variable name | Description | Source

Topographic heterogeneity | The measure of topographic ruggedness at a panda presence/pseudo-absence location. | Extracted from a 30-m spatial resolution DEM from ASTER GDEM (Hook et al., 2001; downloaded from the USGS website), and resampled to 100-m spatial resolution using the SDMtoolbox in ArcGIS.

Climate heterogeneity | The measure of climatic variation at a panda presence/pseudo-absence location. | Extracted from 30 arc-seconds bioclimatic variables (Hijmans et al., 2005; downloaded from the WorldClim website), and resampled to 100-m spatial resolution using the SDMtoolbox in ArcGIS.

Mean forest patch size | The average area of the forest patches within a 5-km radius circle around a panda presence/pseudo-absence location. | Calculated on the National LULC 2000 data with a 30-m spatial resolution (Liu et al., 2002) using Fragstats v4.2 software, and resampled to 100-m spatial resolution.

Distance between forest patches | The nearest distance between the forest patches within a 5-km radius circle around a panda presence/pseudo-absence location. | Calculated on the National LULC 2000 data with a 30-m spatial resolution in Fragstats v4.2 software, and resampled to 100-m spatial resolution.

Distance to road | Distance to the nearest road from a panda presence/pseudo-absence location. | Calculated on the roadmap (1:100000) from the National Geomatics Center of China, and resampled to 100-m spatial resolution in ArcGIS.

Distance to settlement | Distance to the nearest human settlement from a panda presence/pseudo-absence location. | Calculated on the residential map (1:100000) from the National Geomatics Center of China, and resampled to 100-m spatial resolution in ArcGIS.
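The two steps of Section 2.6, IDW interpolation of local estimates and k-means clustering of local coefficients, can be sketched as follows. This is a minimal illustration under stated assumptions: the IDW power of 2, the plain Lloyd's algorithm, the optional `init_idx` argument, and the synthetic coefficients are all hypothetical choices, and the gap statistic used in the study to choose k is not implemented here.

```python
import numpy as np

def idw(sample_xy, sample_val, grid_xy, power=2.0):
    """Inverse distance weighting: each grid value is a weighted mean of
    the sample values with weights 1 / distance**power."""
    d = np.linalg.norm(grid_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)          # avoid division by zero at samples
    w = 1.0 / d ** power
    return (w @ sample_val) / w.sum(axis=1)

def kmeans(data, k, init_idx=None, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids).
    init_idx optionally gives the rows used as starting centroids."""
    rng = np.random.default_rng(seed)
    if init_idx is None:
        init_idx = rng.choice(len(data), size=k, replace=False)
    centroids = data[np.asarray(init_idx)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# IDW between two hypothetical sample points.
sample_xy = np.array([[0.0, 0.0], [2.0, 0.0]])
sample_val = np.array([0.0, 10.0])
surface = idw(sample_xy, sample_val, np.array([[1.0, 0.0], [0.0, 0.0]]))

# Synthetic local coefficients for two variables in two contrasting zones.
rng = np.random.default_rng(42)
zone_a = np.array([-1.0, 0.5]) + 0.1 * rng.normal(size=(30, 2))
zone_b = np.array([1.0, -0.5]) + 0.1 * rng.normal(size=(30, 2))
local_coefs = np.vstack([zone_a, zone_b])
labels, centroids = kmeans(local_coefs, k=2, init_idx=[0, 59])

# Mean regression coefficients per cluster, as reported per zone.
cluster_means = np.array([local_coefs[labels == j].mean(axis=0) for j in range(2)])
print(surface, cluster_means)
```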

3. Results

The results of our global logistic regression model suggested that the distribution of giant pandas was significantly affected by topographic roughness, climate heterogeneity, distance between forest patches, distance to road, and distance to settlement (p < 0.05; Table 2). Among them, the distance to road had the strongest positive association with giant panda presence, while the distance between forest patches had the strongest negative effect on giant pandas, followed by climate heterogeneity and topographic roughness (Table 2). However, mean forest patch size was not significantly associated with panda presence in the study area (p > 0.05).

The range of logistic GWR-derived local coefficients for each variable was much wider than that in the global logistic regression model (Table 3 and Fig. 2). Notably, climate heterogeneity and distance to settlement had both negative and positive coefficient values, indicating that the direction of their effects on giant panda distribution was not consistent across the study area (Table 3). Non-stationary effects of environmental variables were further revealed by the maps of their local coefficients, as shown in Fig. 3. For instance, topographic roughness had a significant negative effect on giant panda presence in the central part of the study area, but the influence became weaker toward the east and west (Fig. 3a). In contrast, the influence of climate heterogeneity was strong in the eastern part but decreased toward the western part of the study area (Fig. 3b). Distance to road was the only variable that had a significant positive effect throughout the study area (Fig. 3e), while the distance to settlement had a weak negative effect on panda distribution in both the western and eastern parts of the study area (Fig. 3f). The association between giant panda presence and mean forest patch size was not significant in almost the entire study area, except for a small central-southern part (Fig. 3c).

The logistic GWR model showed a marked improvement over the non-spatial logistic regression model. A lower AICc value and higher adjusted R² and AUC values for the logistic GWR model (AICc = 230; local adjusted R² = 0.337–0.589; AUC = 0.965) suggested that logistic GWR fitted the data better than the global logistic regression model. The map of logistic GWR-derived local R² showed that logistic GWR had greater explanatory power in the central part of the study area (Fig. 4). Moreover, the residuals of the logistic GWR model had much weaker spatial autocorrelation (Moran’s I = 0.088; p = 0.293) than those of the logistic regression model (Moran’s I = 0.327; p = 0.002). Regarding the percentage of correctly classified points, the overall percentage of success for the logistic GWR model was 89.7% with a Kappa value of 0.794, whereas the overall success for the logistic regression model was 73.8% with a much lower Kappa value of 0.475 (Table 4). The likelihood maps of giant panda presence obtained from both models show that giant pandas were more likely to occur in the central part of the study area (Fig. 5). However, the logistic regression model produced a spotty probability distribution and over-estimated the probability of giant panda presence around the peripheral regions (Fig. 5a). In contrast, the prediction map of the logistic GWR model generally coincides with the spatial pattern of giant panda occurrences in the study area (Fig. 5b).

Results of the k-means cluster analysis on logistic GWR-derived local coefficients were mapped in Fig. 6. Based on the gap statistic, the best number of clusters for panda-environment relationships in the study area was five (k = 5). When k = 2, a western zone (Cluster 1) was differentiated from the rest (Cluster 2) by a positive relationship with climate heterogeneity (Fig. 6a; Table 5), while Cluster 2 was further divided into two groups at k = 3 because of the opposing effects of mean forest patch size (Fig. 6b; Table 5). As k changed to 4 and 5, Cluster 4 (Fig. 6c) and Cluster 5 (Fig. 6d) were further differentiated from the previous Cluster 1 and Cluster 3, respectively, largely due to the different effects of climate heterogeneity and distance to settlement (Table 5).

Table 2

Coefficient estimates for the explanatory variables in the global logistic regression model of giant panda distribution.

Variable | Coefficient | Std. error | z-value | p-value | VIF

Intercept | 0.098 | 0.160 | 0.616 | 0.538 | –

Topographic roughness | −0.466 | 0.175 | −2.658 | 0.008 | 1.147

Climate heterogeneity | −0.545 | 0.188 | −2.898 | 0.004 | 1.344

Mean forest patch size | 0.406 | 0.210 | 1.936 | 0.053 | 2.034

Dist. between forest patches | −1.683 | 0.499 | −3.371 | 0.001 | 1.974

Distance to road | 1.192 | 0.197 | 6.038 | <0.001 | 1.265

4. Discussion

This study evaluated the capability of logistic GWR for characterizing the local relationships between giant panda distribution and environmental variables in the Qinling Mountains of China. Consistent with prior GWR applications (e.g., Windle et al., 2009; Li et al., 2018), we found that logistic GWR was able to characterize how local environmental factors affect the distribution of giant pandas over the study area. Unlike the global logistic regression model that ignores local variability, GWR can explicitly delineate spatial variations in the significance and direction of the panda-environment relationships. The global logistic regression analysis suggested that topographic roughness, climate heterogeneity, distance between forest patches, and distance to settlements were significant factors affecting giant panda distribution, which is similar to the results of previous panda research with non-spatial models (e.g., Wang et al., 2010; Kang et al., 2014). However, the GWR-derived estimates of these variables indicated that their effects were not significant in many subareas. For example, the effect of climate heterogeneity was strong in the eastern part but decreased toward the western part of the study area, indicating that giant pandas prefer more stable weather conditions such as those in the western part of the Qinling Mountains. We also found that giant pandas were sensitive to the distance to roads but generally insensitive to the size of forest patches throughout the study area, implying that the road network was the critical factor limiting giant panda presence and that more efforts should be made to reduce the impacts of road traffic. Moreover, the global model suggested that climate heterogeneity had a negative association and distance to settlement a positive association with giant panda distribution, yet the logistic GWR model indicated that they had both positive and negative relationships over the study area, a pattern that had not been reported by previous panda research. The findings may not come as a surprise since the eastern portion of the study area generally has lower ruggedness as well as a higher density of human settlements than the western part does (Shaanxi Provincial Forestry Department, 2017). The major implication of changing coefficients across space is that the strength of environmental effects on giant panda distribution was non-stationary because of the spatial variability of the environmental variables, showing the importance of spatial heterogeneity in examining interactions between giant pandas and their environmental context.

Since ecological relationships are likely to vary intrinsically over space, incorporating spatial information into modeling can help to identify dominant drivers causing spatial inconsistency in species-environment relationships and, therefore, greatly improve model predictability in a complex system (Saefuddin et al., 2012). In the present study, the logistic GWR model yielded a more realistic map of giant panda presence in comparison with the global logistic regression model, which apparently over-estimated the probability of giant panda presence around the peripheral areas, a result that is not congruent with the actual distribution pattern of giant pandas. This can be expected because our study area covers about 50,000 km². In this situation, the non-spatial analytical methods that are frequently used by decision-makers cannot explore spatially varying relationships and may induce biased inferences, ultimately misguiding practices in habitat restoration and the design of effective reserve networks. Given the risks of using global parameters to model species distribution in a large heterogeneous region, we suggest that future panda-habitat analyses should consider using the GWR technique to improve our understanding of the spatial ecology of giant pandas, for instance, by combining the global regression and GWR methods to improve the prediction of panda presence in heterogeneous landscapes.

Table 3

Coefficient estimates for explanatory variables in the logistic GWR model of giant panda distribution, including the overall percentage of negative (% −) and positive (% +) values. The optimal bandwidth was 21 (number of nearest neighbors).

Variable | Minimum | Lower quartile | Median | Upper quartile | Maximum | % − | % +

Intercept | −0.661 | −0.090 | 0.489 | 1.138 | 1.924 | 26.3 | 73.7

Topographic roughness | −1.006 | −0.674 | −0.489 | −0.410 | −0.330 | 100.0 | 0.0

Climate heterogeneity | −1.992 | −1.418 | −0.973 | −0.143 | 0.375 | 81.8 | 18.2

Mean forest patch size | −0.835 | −0.468 | −0.306 | −0.232 | −0.104 | 100.0 | 0.0

Distance between forest patches | −2.451 | −2.003 | −1.741 | −1.400 | −0.935 | 100.0 | 0.0

Distance to road | 0.805 | 1.152 | 1.395 | 1.443 | 1.626 | 0.0 | 100.0

Distance to settlement | −0.351 | 0.094 | 0.309 | 0.486 | 0.761 | 17.8 | 82.2

Fig. 2. Comparison of coefficient estimates for each environmental variable in global logistic regression and logistic GWR models. For the global logistic regression model, the mean coefficients and 95% CIs are presented. For the logistic GWR model, the coefficients are shown as median, IQR, minimum, maximum, and range.

The present study further underscores the importance of considering spatial non-stationarity when exploring the environmental niche of species across scales. When data are collected within a large spatial extent, it is anticipated that the shapes and strengths of relationships between variables in one or more subareas are different from “global” situations. Our k-means cluster analysis on GWR-derived parameters successfully identified five distinct zones of spatial associations between giant panda distribution and environmental covariates, implying that using global logistic regression (with a single set of parameters) to model species distribution in a large region could be problematic and may lead to ineffective conservation policies and practices. However, the design of broad-scale approaches to panda conservation is commonly based on range-wide analyses of species. If we fail to identify the species-environment differentiations and their underlying mechanisms, there would be a danger of mismatch between policy decisions and species’ real needs. Therefore, we need to partition the region into smaller subareas and fit the model separately in each of them to avoid such issues. The GWR technique can be used for spatial partitioning of the data by characterizing distinct areas of unique spatial associations between environmental covariates across space, as noted by Wimberly et al. (2008). This is particularly helpful for decision-makers or landscape managers to design reserve networks or functional management units (MUs) for species at a landscape scale. Conventionally, the MU boundaries for giant pandas were designed based on administrative boundaries and socioeconomic concerns (e.g., human-inhabited or not). Such MUs are convenient for administrative management but may to some extent lack ecological significance. When comparing the five distinct zones characterized by the k-means cluster analysis (k = 5) for giant pandas with the current MU planning for the Giant Panda National Park, three different clusters of panda-environment relationships were found within the core distribution areas of the Qinling Mountains. Therefore, a comprehensive analysis of unique zones and refinement of the current MUs with consideration of non-stationarity in ecological processes would greatly benefit the regional conservation of giant pandas.

Fig. 3. Interpolated continuous surfaces of the logistic GWR-derived local coefficient estimates for variables (a) topographic roughness, (b) climate heterogeneity, (c) mean forest patch size, (d) distance between forest patches, (e) distance to road, and (f) distance to settlement. Filled circles denote the samples where the relationship between panda distribution and the variable was significant (p < 0.05), and unfilled circles denote the samples where it was not.

In addition to the GWR method, a few other spatial modeling techniques also deal with spatial dependency and non-stationarity. Paez et al. (2008) compared GWR with moving windows regression and moving windows kriging techniques and found that GWR performs best for interpolation. In general, kriging can incorporate systematic residual information (i.e., error autocorrelation) to obtain an improved predictive model, while GWR is designed to model spatially heterogeneous processes as well as locational effects. The present study demonstrated that logistic GWR not only had more explanatory power but also identified spatial non-stationarity in relationships between giant pandas and their environments. Furthermore, GWR is ‘GIS-friendly’ and can interact seamlessly with GIS platforms to spatially visualize results such as local coefficient estimates, predictions, and residuals. Spatial mapping of these statistics enables us to examine geographic “hotspots” in the data that would be missed in a global analysis (Kimsey et al., 2008), which further helps explore the critical relationships generating the patterns uncovered by GWR. In these regards, GWR is a promising tool for analyzing the distribution of a species in heterogeneous landscapes.

Table 4

Summary of model performances for logistic regression and logistic GWR models of giant panda distribution.

| Model | n | Deviance | AICc | Pseudo R2 | AUC | P_Success^a | Kappa | TSS |
|---|---|---|---|---|---|---|---|---|
| Global logistic regression | 270 | 273 | 288 | 0.270 | 0.832 | 73.8 | 0.475 | 0.478 |
| Logistic GWR | 270 | 192 | 230 | 0.337–0.589 | 0.965 | 89.7 | 0.794 | 0.795 |
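To make the threshold-dependent accuracy measures in Table 4 concrete, the following sketch computes the proportion correctly classified, Cohen's Kappa, and the true skill statistic (TSS) from a 2 x 2 confusion matrix. The 0.5 classification threshold, the function names, and the example counts are assumptions chosen for illustration; they are not taken from the study.

```python
def confusion_counts(observed, predicted_prob, threshold=0.5):
    """Return (tp, fp, fn, tn) for binary observations and predicted probabilities."""
    tp = fp = fn = tn = 0
    for y, p in zip(observed, predicted_prob):
        pred = 1 if p >= threshold else 0  # assumed cut-off, not from the paper
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_metrics(tp, fp, fn, tn):
    """Proportion correct, Cohen's Kappa, and TSS from confusion-matrix counts."""
    n = tp + fp + fn + tn
    success = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "success": success,
        "kappa": (success - expected) / (1 - expected),
        "tss": sensitivity + specificity - 1,
    }


# Hypothetical counts: 50 presences and 50 pseudo-absences.
metrics = accuracy_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

When presences and absences are equally frequent, both Kappa and TSS reduce to 2 x (proportion correctly classified) - 1, which may explain why the two statistics are nearly identical for each model in Table 4.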

Fig. 6. Mapped results of k-means cluster analyses of the local coefficient estimates from the logistic GWR for giant panda distribution: (a) k = 2, (b) k = 3, (c) k = 4, and (d) k = 5.

Table 5

Mean logistic GWR coefficient estimates (s.d. in parentheses) for each cluster identified by the k-means cluster analyses.

| k | Cluster | n | Topographic roughness | Climate heterogeneity | Mean forest patch size | Distance between forest patches | Distance to road | Distance to settlement |
|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 91 | 0.414 (0.058) | 0.002 (0.277) | 0.396 (0.211) | 1.304 (0.298) | 1.043 (0.163) | 0.322 (0.155) |
| | 2 | 179 | 0.618 (0.174) | 1.259 (0.391) | 0.350 (0.150) | 1.905 (0.282) | 1.436 (0.073) | 0.251 (0.316) |
| 3 | 1 | 91 | 0.414 (0.058) | 0.002 (0.277) | 0.396 (0.211) | 1.304 (0.298) | 1.043 (0.163) | 0.322 (0.155) |
| | 2 | 76 | 0.480 (0.072) | 1.567 (0.274) | 0.263 (0.089) | 1.748 (0.143) | 1.395 (0.039) | 0.559 (0.112) |
| | 3 | 103 | 0.720 (0.155) | 1.032 (0.300) | 0.414 (0.154) | 2.021 (0.303) | 1.467 (0.078) | 0.024 (0.207) |
| 4 | 1 | 64 | 0.398 (0.036) | 0.087 (0.209) | 0.273 (0.092) | 1.133 (0.116) | 0.967 (0.113) | 0.401 (0.092) |
| | 2 | 76 | 0.480 (0.072) | 1.567 (0.274) | 0.263 (0.089) | 1.748 (0.143) | 1.395 (0.039) | 0.559 (0.112) |
| | 3 | 103 | 0.720 (0.155) | 1.032 (0.300) | 0.414 (0.154) | 2.021 (0.303) | 1.467 (0.078) | 0.024 (0.207) |
| | 4 | 27 | 0.453 (0.079) | 0.212 (0.308) | 0.688 (0.088) | 1.709 (0.178) | 1.222 (0.118) | 0.134 (0.103) |
| 5 | 1 | 64 | 0.398 (0.036) | 0.087 (0.209) | 0.273 (0.092) | 1.133 (0.116) | 0.967 (0.113) | 0.401 (0.092) |
| | 2 | 76 | 0.480 (0.072) | 1.567 (0.274) | 0.263 (0.089) | 1.748 (0.143) | 1.395 (0.039) | 0.559 (0.112) |
| | 3 | 49 | 0.627 (0.148) | 1.005 (0.262) | 0.312 (0.102) | 1.772 (0.243) | 1.485 (0.098) | 0.128 (0.144) |
| | 4 | 27 | 0.453 (0.079) | 0.212 (0.308) | 0.688 (0.088) | 1.709 (0.178) | 1.222 (0.118) | 0.134 (0.103) |
| | 5 | 54 | 0.804 (0.106) | 1.058 (0.331) | 0.507 (0.133) | 2.247 (0.117) | 1.450 (0.048) | 0.161 (0.152) |

While GWR performs well in spatial relationship analysis, there has been some controversy about whether GWR is appropriate for making inferences about multivariate spatial relationships (Paez et al., 2011). Fotheringham et al. (2002) noted that GWR should be used with caution because geographical coordinates are the only information required to estimate local coefficients at unobserved locations. Li et al. (2018) mentioned that this method is not suitable for predicting species’ future distribution under substantial changes in environmental conditions. Because local regression coefficients in GWR are estimated based on the neighboring observations, attention should be given to potential collinearity in local coefficients when interpreting ecological associations between species and their environment (Wheeler and Tiefelsdorf, 2005). Furthermore, sampling density can influence the performance of GWR (Chen et al., 2012; Ye et al., 2017), and sparse data points may not be sufficient to produce ecologically interpretable results because the estimation of local parameters requires large amounts of data (Fotheringham et al., 2002). Additionally, we acknowledge that the study would undoubtedly be refined by including additional explanatory variables (e.g., food quantity and quality) affecting giant pandas’ distribution. Nevertheless, our relatively simple analysis shows that GWR is a promising tool for taking spatial heterogeneity into account in analyzing ecological relationships, which allows us to gain deeper insight into the causes and consequences of spatial non-stationarity in ecological processes.
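The dependence of local estimates on neighboring observations can be sketched as follows. The example fits a kernel-weighted logistic regression (intercept plus one covariate) at a chosen focal location, using a fixed-bandwidth Gaussian kernel and Newton-Raphson updates. The kernel form, the bandwidth, the single covariate, the function names, and the synthetic transect data are all assumptions made for illustration; they do not reproduce the model specification of this study.

```python
import math


def gaussian_weights(coords, focal, bandwidth):
    """Kernel weight of every observation relative to one focal location."""
    return [math.exp(-0.5 * (math.dist(c, focal) / bandwidth) ** 2)
            for c in coords]


def weighted_logistic_fit(x, y, w, n_iter=25):
    """Newton-Raphson fit of logit P(y = 1) = b0 + b1 * x with weights w."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            z = max(-30.0, min(30.0, b0 + b1 * xi))  # guard against overflow
            p = 1.0 / (1.0 + math.exp(-z))
            r = wi * (yi - p)        # weighted score contribution
            v = wi * p * (1.0 - p)   # weighted information contribution
            g0 += r
            g1 += r * xi
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break  # information matrix is singular; stop updating
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-8:
            break
    return b0, b1


# Synthetic transect of 20 sites: the covariate favours presence in the
# west (sites 0-9) and disfavours it in the east (sites 10-19).
coords = [(float(i), 0.0) for i in range(20)]
x = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5] * 2
y_west = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
y = y_west + [1 - v for v in y_west]

w_west = gaussian_weights(coords, (2.0, 0.0), bandwidth=3.0)
w_east = gaussian_weights(coords, (17.0, 0.0), bandwidth=3.0)
_, west_slope = weighted_logistic_fit(x, y, w_west)
_, east_slope = weighted_logistic_fit(x, y, w_east)
_, global_slope = weighted_logistic_fit(x, y, [1.0] * len(x))
effective_n = sum(w_west)  # rough count of observations informing the west fit
print(west_slope, east_slope, global_slope, effective_n)
```

In this constructed example the two local slopes have opposite signs while the pooled slope is zero, and the sum of the kernel weights gives a rough sense of how few observations inform each local fit when samples are sparse or the bandwidth is small.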

5. Conclusions

The study demonstrates that logistic GWR is preferable to global logistic regression in characterizing panda-environment relationships in the Qinling Mountains, China. Using the logistic GWR technique, we successfully delineated how the significance and direction of the relationships between giant panda distribution and environmental variables varied across the study area. Specific zones with distinct panda-environment associations were also identified by the spatial clustering of logistic GWR-derived coefficients. The outputs of GWR are not just a reflection of the spatial variations of the covariates but provide important insights into the relative importance of the ecological drivers at local scales. In this regard, GWR could be a promising tool for identifying spatially varying ecological relationships and providing useful information for species of conservation concern in heterogeneous landscapes.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Key Research and Development Program of China (grant number 2016YFC0503200) and the National Natural Science Foundation of China (grant number 31672310). We thank the Forestry and Grassland Administration of Shaanxi Province for providing the 3rd panda survey data. We are also grateful for the helpful comments from the editor and anonymous reviewers.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.gecco.2019.e00894.

Appendices

Table A.1

Preliminary set of potential predictor variables for the correlation testing prior to panda-habitat analysis.

| Variable name | Description | Source |
|---|---|---|
| Altitude (m) | Elevation at a panda presence/pseudo-absence location | SRTM 90 m DEM (downloaded from the USGS website) |
| Topographic roughness | Measure of topographic ruggedness at a panda presence/pseudo-absence location | Extracted from SRTM 90 m DEM using the SDMtoolbox in ArcGIS |
| Climate heterogeneity | Measure of climatic variation at a panda presence/pseudo-absence location | Extracted from 30 arc-seconds bioclimatic variables (downloaded from the WorldClim website) using the SDMtoolbox in ArcGIS |
| Largest patch area (ha) | Area of the largest forest patch within a 5 km radius of a panda presence/pseudo-absence location | Calculated on the 30-m National LULC 2000 data in Fragstats software |
| Number of patches | Number of forest patches within a 5 km radius of a panda presence/pseudo-absence location | |
| Percentage of landscape occupied by forest (%) | Proportional abundance of forest in the landscape within a 5 km radius of a panda presence/pseudo-absence location | |
| Mean forest patch size (ha) | Mean size of forest patches within a 5 km radius of a panda presence/pseudo-absence location | |
| Distance between patches (km) | Nearest distance between forest patches within a 5 km radius of a panda presence/pseudo-absence location | |
| Distance to road (km) | Distance to the nearest public road from a panda presence/pseudo-absence location | Calculated on the 1:100000 roadmap from National Geomatics Center of China in ArcGIS |
| Distance to settlement (km) | Distance to the nearest human settlement from a panda presence/pseudo-absence location | Calculated on the 1:100000 residential map from National Geomatics Center of China in ArcGIS |


Fig. A.1. Coefficients of pairwise correlations for all environmental variables.
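A pairwise correlation screen of candidate predictors, of the kind summarized in Fig. A.1, might look like the sketch below. The variable values are invented, and the |r| >= 0.7 cut-off is a commonly used rule of thumb assumed here for illustration; the threshold applied in this study is not stated in this section.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def correlated_pairs(table, threshold=0.7):
    """List variable pairs whose absolute correlation reaches the threshold."""
    names = list(table)
    pairs = []
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            r = pearson(table[u], table[v])
            if abs(r) >= threshold:  # assumed cut-off, not from the paper
                pairs.append((u, v, round(r, 3)))
    return pairs


# Hypothetical values at five locations.
candidates = {
    "altitude_m": [1200, 1500, 1800, 2100, 2400],
    "forest_pct": [40, 50, 60, 70, 80],
    "dist_road_km": [3.0, 1.0, 4.0, 1.5, 2.5],
}
flagged = correlated_pairs(candidates)
print(flagged)
```

One variable from each flagged pair would then typically be dropped before model fitting to limit collinearity among predictors.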

Fig. A.2. Maps of the six environmental variables used in the models. (a) topographic roughness, (b) climate heterogeneity, (c) mean forest patch size, (d) distance between forest patches, (e) distance to road, and (f) distance to settlement.


References

Atkinson, P.M., German, S.E., Sear, D.A., Clark, M.J., 2003. Exploring the relations between riverbank erosion and geomorphological controls using geographically weighted logistic regression. Geogr. Anal. 35, 58–82.

Augustin, N.H., Mugglestone, M.A., Buckland, S.T., 1996. An autologistic model for the spatial distribution of wildlife. J. Appl. Ecol. 33, 339–347.

Bakka, H., Vanhatalo, J., Illian, J.B., Simpson, D., Rue, H., 2019. Non-stationary Gaussian models with physical barriers. Spatial Stat. 29, 268–288.

Barbosa, A.M., 2015. fuzzySim: applying fuzzy logic to binary similarity indices in ecology. Methods Ecol. Evol. 6, 853–858.

Bickford, S.A., Laffan, S.W., 2006. Multi-extent analysis of the relationship between pteridophyte species richness and climate. Glob. Ecol. Biogeogr. 15, 588–601.

Brown, J.L., 2014. SDMtoolbox: a python-based GIS toolkit for landscape genetic, biogeographic and species distribution model analyses. Methods Ecol. Evol. 5, 694–700.

Brunsdon, C., Fotheringham, A.S., Charlton, M.E., 1996. Geographically weighted regression: a method for exploring spatial nonstationarity. Geogr. Anal. 28, 281–298.

Chen, G., Zhao, K.G., McDermid, G.J., Hay, G.J., 2012. The influence of sampling density on geographically weighted regression: a case study using forest canopy height and optical data. Int. J. Remote Sens. 33, 2909–2924.

Coppolillo, P.B., 2000. The landscape ecology of pastoral herding: spatial analysis of land use and livestock production in East Africa. Hum. Ecol. 28, 527–560.

Feng, T.-T., van Manen, F.T., Zhao, N.-X., Li, M., Wei, F.-W., 2009. Habitat assessment for giant pandas in the Qinling Mountain region of China. J. Wildl. Manag. 73, 852–858.

Fortin, M.J., Dale, M.R.T., Ver Hoef, J.M., 2006. Spatial Analysis in Ecology.

Fotheringham, A.S., Charlton, M., Brunsdon, C., 1996. The geography of parameter space: an investigation of spatial non-stationarity. Int. J. Geogr. Inf. Syst. 10, 605–627.

Fotheringham, A.S., Brunsdon, C., Charlton, M., 2002. Geographically Weighted Regression: the Analysis of Spatially Varying Relationships. John Wiley and Sons, West Sussex, England, United Kingdom.

Hijmans, R.J., Cameron, S.E., Parra, J.L., Jones, P.G., Jarvis, A., 2005. Very high resolution interpolated climate surfaces for global land areas. Int. J. Climatol. 25, 1965–1978.

Hook, S.J., Myers, J.E.J., Thome, K.J., Fitzgerald, M., Kahle, A.B., 2001. The MODIS/ASTER airborne simulator (MASTER) - a new instrument for earth science studies. Remote Sens. Environ. 76, 93–102.

Hu, M., Li, Z., Wang, J., Jia, L., Liao, Y., Lai, S., Guo, Y., Zhao, D., Yang, W., 2012. Determinants of the incidence of hand, foot and mouth disease in China using geographically weighted regression models. PLoS One 7, e38978.

Hull, V., Roloff, G., Zhang, J., Liu, W., Zhou, S., Huang, J., Xu, W., Ouyang, Z., Zhang, H., Liu, J., 2014. A synthesis of giant panda habitat selection. Ursus 25, 148–162.

Kang, D., Wang, X., Yang, H., Duan, L., Li, J., 2014. Habitat use by giant panda in relation to man-made forest in Wanglang Nature Reserve of China. Environ. Sci. Pollut. Control Ser. 21, 13440–13445.

Kimsey, M.J., Moore, J., McDaniel, P., 2008. A geographically weighted regression analysis of Douglas-fir site index in north central Idaho. For. Sci. 54, 356–366.

Kupfer, J.A., Farris, C.A., 2007. Incorporating spatial non-stationarity of regression coefficients into predictive vegetation models. Landsc. Ecol. 22, 837–852.

Li, B., Cao, J., Guan, L., Mazur, M., Chen, Y., Wahle, R.A., 2018. Estimating spatial non-stationary environmental effects on the distribution of species: a case study from American lobster in the Gulf of Maine. ICES (Int. Counc. Explor. Sea) J. Mar. Sci. 75, 1473–1482.

Lichstein, J.W., Simons, T.R., Shriner, S.A., Franzreb, K.E., 2002. Spatial autocorrelation and autoregressive models in ecology. Ecol. Monogr. 72, 445–463.

Lindgren, F., Rue, H., Lindström, J., 2011. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J. R. Stat. Soc. Ser. B 73, 423–498.

Liu, C., Liu, J., Jiao, Y., Tang, Y., 2019. Exploring spatial nonstationary environmental effects on species distribution: a case study of Yellow Perch in Lake Erie. Peer J. 7, e27592v1.

Liu, J., Liu, M., Deng, X., Zhuang, D., Zhang, Z., Di, L., 2002. The land-use and land-cover change database and its relative studies in China. J. Geogr. Sci. 12, 275–282.

Loucks, C., Wang, H., 2004. Panel Report 9.1: Assessing the Habitat and Distribution of the Giant Panda: Methods and Issues, pp. 149–154.

Lu, B., Harris, P., Gollini, I., Charlton, M., Brunsdon, C., 2013. GWmodel: an R Package for Exploring Spatial Heterogeneity.

Martínez-Minaya, J., Conesa, D., Bakka, H., Pennino, M.G., 2019. Dealing with physical barriers in bottlenose dolphin (Tursiops truncatus) distribution. Ecol. Model. 406, 44–49.

McNew, L.B., Gregory, A.J., Sandercock, B.K., 2013. Spatial heterogeneity in habitat selection: nest site selection by greater prairie-chickens. J. Wildl. Manag. 77, 791–801.

Miller, J., 2005. Incorporating spatial dependence in predictive vegetation models: residual interpolation methods. Prof. Geogr. 57, 169–184.

Miller, J.A., 2012. Species distribution models: spatial autocorrelation and non-stationarity. Prog. Phys. Geogr.: Earth Environ. 36, 681–692.

Miller, J.A., Hanham, R.Q., 2011. Spatial nonstationarity and the scale of species–environment relationships in the Mojave Desert, California, USA. Int. J. Geogr. Inf. Sci. 25, 423–438.

Mitchell, M.S., Lancia, R.A., Gerwin, J.A., 2001. Using landscape-level data to predict the distribution of birds on a managed forest: effects of scale. Ecol. Appl. 11, 1692–1708.

Moran, P.A.P., 1950. Notes on continuous stochastic phenomena. Biometrika 37, 17–23.

Paez, A., Long, F., Farber, S., 2008. Moving window approaches for hedonic price estimation: an empirical comparison of modelling techniques. Urban Stud. 45, 1565–1581.

Paez, A., Farber, S., Wheeler, D., 2011. A simulation-based study of geographically weighted regression as a method for investigating spatially varying relationships. Environ. Plan. 43, 2992–3010.

Pereira, J.M.C., Itami, R.M., 1991. GIS-based habitat modeling using logistic multiple regression - a study of the Mt. Graham red squirrel. Photogramm. Eng. Remote Sens. 57, 1475.

Qi, D., Zhang, S., Zhang, Z., Hu, Y., Yang, X., Wang, H., Wei, F., 2012. Measures of giant panda habitat selection across multiple spatial scales for species conservation. J. Wildl. Manag. 76, 1092–1100.

R Core Team, 2018. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

Radeloff, V.C., Pidgeon, A.M., Hostert, P., 1999. Habitat and population modelling of roe deer using an interactive geographic information system. Ecol. Model. 114, 287–304.

Real, R., Barbosa, A.M., Vargas, J.M., 2006. Obtaining environmental favourability functions from logistic regression. Environ. Ecol. Stat. 13, 237–245.

Saefuddin, A., Setiabudi, N.A., Fitrianto, A., 2012. On comparison between logistic regression and geographically weighted logistic regression: with application to Indonesian poverty data. World Appl. Sci. J. 19.

Sangalli, L.M., Ramsay, J.O., Ramsay, T.O., 2013. Spatial spline regression models. J. R. Stat. Soc. Ser. B Stat. Methodol. 75, 681–703.

Shaanxi Provincial Forestry Department, 2017. The Pandas of Shaanxi - The 4th Survey Report on Giant Pandas in Shaanxi Province, China. Shaanxi Science & Technology Press, Xi’an.


Shi, H., Laurent, E.J., LeBouton, J., Racevskis, L., Hall, K.R., Donovan, M., et al., 2006. Local spatial modeling of white-tailed deer distribution. Ecol. Model. 190, 171–189.

State Forestry Administration of China, 2006. The Third National Survey Report on Giant Pandas in China. Science Press, Beijing.

Swaisgood, R.R., Wei, F., Wildt, D.E., Kouba, A.J., Zhang, Z., 2010. Giant panda conservation science: how far we have come. Biol. Lett. 6, 143–145.

Tavernia, B.G., Reed, J.M., 2012. The impact of exotic purple loosestrife (Lythrum salicaria) on wetland bird abundances. Am. Midl. Nat. 168, 352–363.

Tibshirani, R., Walther, G., Hastie, T., 2001. Estimating the number of clusters in a data set via the gap statistic. J. R. Stat. Soc. Ser. B 63, 411–423.

Turner, W.R., Tjørve, E., 2005. Scale-dependence in species-area relationships. Ecography 28, 721–730.

Vega-Corredor, M.C., Opadeyi, J., 2014. Hydrology and public health: linking human leptospirosis and local hydrological dynamics in Trinidad, West Indies. Earth Perspect. 1, 3.

Wang, T., Ye, X., Skidmore, A.K., Toxopeus, A.G., 2010. Characterizing the spatial distribution of giant pandas (Ailuropoda melanoleuca) in fragmented forest landscapes. J. Biogeogr. 37, 865–878.

Wei, F., Swaisgood, R., Hu, Y., Nie, Y., Yan, L., Zhang, Z., Qi, D., Zhu, L., 2015. Progress in the ecology and conservation of giant pandas. Conserv. Biol. 29, 1497–1507.

Wheeler, D., Tiefelsdorf, M., 2005. Multicollinearity and correlation among local regression coefficients in geographically weighted regression. J. Geogr. Syst. 7, 161–187.

Wheeler, D.C., 2007. Diagnostic tools and a remedial method for collinearity in geographically weighted regression. Environ. Plan. 39, 2464–2481.

Wimberly, M.C., Yabsley, M.J., Baer, A.D., Dugan, V.G., Davidson, W.R., 2008. Spatial heterogeneity of climate and land-cover constraints on distributions of tick-borne pathogens. Glob. Ecol. Biogeogr. 17, 189–202.

Windle, M.J.S., Rose, G.A., Devillers, R., Fortin, M.-J., 2009. Exploring spatial non-stationarity of fisheries survey data using geographically weighted regression (GWR): an example from the Northwest Atlantic. ICES (Int. Counc. Explor. Sea) J. Mar. Sci. 67, 145–154.

Wu, J., Jelinski, D.E., Luck, M., Tueller, P.T., 2000. Multiscale analysis of landscape heterogeneity: scale variance and pattern metrics. Geograp. Inf. Sci. 6, 6–19.

Yackulic, C.B., Ginsberg, J.R., 2016. The scaling of geographic ranges: implications for species distribution models. Landsc. Ecol. 31, 1195–1208.

Ye, H., Huang, W., Huang, S., Huang, Y., Zhang, S., Dong, Y., Chen, P., 2017. Effects of different sampling densities on geographically weighted regression kriging for predicting soil organic carbon. Spatial Stat. 20, 76–91.

Ye, X., Yong, Y., Yu, C., Zhang, Z., 2007. Den selection by the giant panda in Foping Nature Reserve, China. J. Nat. Hist. 41, 2529–2536.

Zhang, L., Shi, H., 2004. Local modeling of tree growth by geographically weighted regression. For. Sci. 50, 225–244.

Zhang, Z., Swaisgood, R.R., Zhang, S., Nordstrom, L.A., Wang, H., Gu, X., Hu, J., Wei, F., 2011. Old-growth forest is what giant pandas really need. Biol. Lett. 7, 403.

Zou, K.H., O’Malley, A.J., Mauri, L., 2007. Receiver-operating characteristic analysis for evaluating diagnostic tests and predictive models. Circulation 115, 654–657.