Nest site selection of muskrats in Flevoland

(1)

2016

Nest site selection of muskrats in Flevoland

BACHELOR THESIS

R.R. DE KONING - 10111085

MAJOR EARTH SCIENCES

UNIVERSITY OF AMSTERDAM

SUPERVISOR: E.E. VAN LOON

DATE: 04-07-2016

(2)

Abstract

Muskrat populations in the Netherlands pose a threat for our dikes. The digging activities of the muskrats can cause dikes to collapse. The cost of muskrat control in the Netherlands is over 34 million euros per year. Knowledge of favorable terrains and water characteristics, which influence the nest site selection of muskrats, could lower these costs. This research examines the relationship between the occurrence of muskrat burrows, riverbank aspect, riverbank steepness and water depth. Occurrence data was obtained through vangstnet registratie. The predictor variables were determined with the use of LiDAR-derived elevation models, aerial photographs and topographic maps of waterways. A relationship between the riverbank aspect and the occurrence of muskrats was found. No relationship was found between riverbank steepness or water depth. Logistic regression was used to create a prediction model. The best model was constructed with the riverbank aspect towards the south as a predictor variable. The overall accuracy of this model was 61.5 percent.

(3)

Table of Contents

Introduction ... 3

Research questions ... 4

Theoretical framework ... 5

Methods... 6

Obtaining the catch data ... 6

Bank steepness ... 7

Water depth ... 7

Bank aspect ... 8

Creating a prediction model ... 9

Evaluating the model ... 9

Results ... 11

Study area ... 11

Univariate analysis ... 12

Multivariate analysis ... 13

Five-fold cross validation ... 14

Threshold adjustment ... 14

Visualizing the prediction ... 15

Univariate versus multivariate model ... 17

Discussion ... 18

Conclusion ... 20

References ... 21

Appendix A, ROC-Curves. ... 22

Appendix B. Confusion matrices univariate model ... 23

Appendix C. Confusion matrices multivariate model ... 24

Appendix D. Confusion matrices 5-fold cross-validation. ... 24

Univariate model, aspect south. ... 24

Multivariate model, aspect east and south. ... 25

(4)

Introduction

The muskrat (Ondatra zibethicus) is a semiaquatic rodent, native to North America. They were imported to the European continent in the 1930’s for fur production (Heidinga, 2006). The muskrat is a popular species for the production of fur due to their fast reproducibility (Le Boulengé & Boulengé-Nguyen, 1981). This fast reproducibility is also one of the reasons why muskrat populations quickly dispersed all over Europe. Specifically the more humid northern part of Europe is an ideal habitat for the muskrat. There is an abundance of vegetation to forage and there are little to no natural predators present. The first muskrat in the Netherlands was detected in 1941 and nowadays muskrat populations can be found all over the Netherlands, except for the Wadden Islands Vlieland and Texel (Barends, 1998).

Muskrat populations are controlled in the Netherlands for several reasons. The main reason is to mitigate their digging activities. They build their nests in banks of lakes, rivers and ponds. The entrance of their burrow is often situated below the water level, leading to nest chambers in the riverbank above the water level. Muskrat burrows may exist of a large network of tunnels and chambers, which enhances instability of the embankments and can cause dikes to collapse (Van Vliet & Lengkeek, 2007). In the Netherlands, where about twenty-six percent of the land is situated below sea level (Benthem, 2015), maintaining the stability of dikes is a critical part of water management. Limiting the damage done by muskrats is therefore one of the maintenance tasks of the Dutch local water authorities. From 1986 on, muskrat populations have been actively controlled in the Netherlands (Heidinga, 2006), with yearly catches of over 400.000 muskrats in the peak years (figure 1). Muskrat catches have clearly declined over the last decade, but with their high reproducibility rate, active population control is still a necessity (Van Vliet & Lengkeek, 2007).

Figure 1. Yearly muskrat catches from 1950 till 2015.

The cost of muskrat control in the Netherlands is over 34 million euros per year (Van Vliet & Lengkeek, 2007). Several organizations, like the Dutch Society for the Protection of Animals, raise the question if these costs do not exceed the costs of repairing the damage done by the muskrats. Additionally, the

(5)

According to the vision of the Dutch Society for the Protection of Animals it is better to prevent damage. They propose several preventive measures. The first approach is the implementation of a physical barrier in the dikes, which makes it impermeable for muskrats. A second approach is creating nature friendly riverbanks, by adding an extra front bank. In this additional front bank, muskrats can build their burrows, but do not damage the actual dike. However adapting every riverbank in the Netherlands would lead to high economic costs (Van Vliet & Lengkeek, 2007).

The purpose of this research is to examine which terrain and water characteristics influence the selection of a riverbank for muskrats to locate their burrows. By investigating favorable riverbanks characteristics, preventive measures and muskrat control can be conducted more effectively. Knowledge of favorable nest site locations can reduce labor costs for muskrat population control, due to the fact that less man-hours will have to be spent on locating the muskrat burrows. Furthermore, knowledge of unfavorable nest site locations can lower the costs for implementing preventive measures, since unfavorable embankments do not need to be protected against damage. This research is carried out at the request of Waterschap Zuiderzeeland and will focus on the southern and eastern district of the province Flevoland.

Research questions

The main question of this research follows as:

- To what extend is it possible to predict the nest site locations of muskrat in Flevoland on the basis of terrain and water characteristics?

The following sub-question will be used as a support to answer the main question:

- Do bank aspect, bank steepness and water depth determine the occurrence of muskrat burrows? Before elaborating on how these questions will be answered first the theoretical framework behind the research will be discussed in the next chapter.

(6)

Theoretical framework

Muskrat populations can thrive in any aquatic biotope, provided that there is an abundance of vegetation to forage (Allen & Hoffman, 1984). In the Netherlands, their diet consists mainly out of rooted aquatic vegetation. In the summer they prefer cattail (Typha) and in the winter their diet consists mainly out of the roots of common reed (Phragmites)(Van Troostwijk, 1976). The vegetation cover of riverbanks is preferably 50 % or higher. (Allen & Hoffman, 1984). The muskrat prefers stagnant water with a level with little to no fluctuation (Ibid.). Most stagnant waters tend to be more shallow waterbodies. The ideal depth range for muskrats is between 0.46 and 1.20 meter (Errington, 1963). The preferred riverbank for building a burrow is presumably a steep riverbank facing the south, due the solar radiation.

The territory of a muskrat can extend to a diameter of 75 meter (Schuring et al., 2005) and their habitat can cover an area of several kilometers (Barends, 1987). Muskrats make use of several burrows within their habitat. According to Ahlers, Heske, Schooley and Mitchell (2010) a single muskrat can use up to twelve different burrows. A distinction can be made between smaller burrows used to take shelter when threatened and burrows that are used as a nest site location (figure 2).

For this research every located burrow will be appointed as a nest site location. Three different water and terrain characteristics will be examined to construct a prediction model.

Prediction models are often used for ecological purposes. There are many kinds of approaches but one of the most important is species distribution modelling (SDM). SDM extrapolates species distribution data in space, based on a statistical model. SDM’s are most often evaluated on their discrimination capacity. The most widely used criteria for this is the overall accuracy (Liu, White & Newell, 2009). The statistical as well as the empirical research methodology, will be explained in the following segment.

(7)

Methods

Obtaining the catch data

Muskrat control is carried out by the Dutch local water authorities. Data on muskrat control is stored in the vangstnet registratie, input for this is gathered by using the MuRa-app. Through this smartphone application, muskrat controllers store information regarding the whereabouts and effectiveness of their traps. With these data, insights can be given about the dispersion of muskrat populations. To determine nest site locations, a distinction needs to be made between the different kinds of traps. There are drowning cages, which float around just beneath the water surface at random locations. If a muskrat is caught in this trap, the muskrat controllers start to investigate this area by examining all the riverbanks for muskrat burrows. When a burrow is located, conibear traps are placed in the entrance of the burrow beneath the water level (figure 3).

For this research the location data of the conibear traps are being used, since these traps are placed at a true nest site location. The random placed traps do not provide any information on the occurrence of burrows. The catch data from 01-01-2014 till 01-01-2016 will be examined. This timeframe is chosen due to the fact that catch data before 2014 lacks the spatial information on the whereabouts of the conibear traps. This spatial data is crucial when researching which terrain and water characteristics influence the nest site selection of muskrats.

The data was obtained via the vangstnet registratie website and imported into ArcGIS. In the following section, an elaboration will follow concerning the methodology behind the determination of the terrain and water characteristics for all the data points.

(8)

Bank steepness

To determine the bank steepness a slope map of Flevoland was created in ArcGIS. This was extracted from the Algemeen Hoogtebestand Nederland (AHN2) (figure 5). With the use of the spatial analyst tool the slope for every data point was extracted in degrees between 0 and 90.

Figure 5. Slope map with muskrat burrows made in ArcGIS 10.2

Water depth

The depth for the water bodies in Flevoland was determined using the theoretical depths according to the maintenance obligation data of the Waterschap Zuiderzeeland. For every waterway this data is extracted from the database of Waterschap Zuiderzeeland containing topographic maps (figure 6). The water depth was assigned to every data point in centimeters.

(9)

Bank aspect

The bank aspect of the muskrat burrows was manually determined. For every data point the aspect of the riverbank towards the south and towards the east was determined using aerial photographs in ArcGIS (figure 7). A half circle protractor was used to determine the aspect. For the aspect towards the south the protractor was placed on top of the north line. The aspect was then determined from 0 to 180 degrees in both ways, with 0 degrees standing for directly facing the south (figure 8). The same procedure was done to determine the aspect towards the east but here the protractor was rotated 90 degrees and placed on a line perpendicular to the north line (figure 9).

Figure 8. Protractor used for determining degrees south. Figure 9. Protractor used for determining degrees east. Figure 7. Aerial photograph with muskrat burrows made in ArcGIS 10.

(10)

Creating a prediction model

To predict the nest site location of the muskrat a generalized linear model (GLM) will be used. This is the most commonly used model in species distribution modeling. The prediction model will be constructed according to a design I, sampling protocol C (Manly, McDonald, Thomas, McDonald, & Erickson, 2002). In this case-control design, two different samples are taken from the population, respectively the used and unused nest site locations. According to Meynard and Quinn (2007) a GLM is best suited for predicting species distribution in a case control design. In this research the response variable is occurrence or nonoccurrence. This can be explained by a binomial distribution with the ‘logit link’, a logistic regression model. The equation behind the generalized linear model is stated below:

With the logit link:

The coefficients in the generalized linear model can be interpreted as the increment or decrement of the probability of the occurrence of a muskrat burrow. For instance, in this research a negative coefficient for the degrees south and a positive coefficient for the riverbank steepness is expected.

Evaluating the model

The prediction model will be evaluated based on different criteria. For the evaluation of the classification, confusion matrices will be constructed. These will be used to determine the overall accuracy of the different constructed models.

An example of a confusion matrix can be seen in Table 1. It shows the predicted distribution versus the actual distribution. With this data, the sensitivity and the specificity of the model can be calculated. The sensitivity (TP/(TP+FN) is the proportion of actual occurrences that are accurately predicted and the specificity (TN/(TN+FP) is the proportion of actual non-occurrences that are accurately predicted. The calculated overall accuracy will be used to compare the quality of the constructed models.

Another criteria used to assess the model quality, will be the AIC (Akaike Information Criterion). AIC is a goodness of fit measure that favors smaller residual error in the model, but penalizes for including more predictors and prevents over fitting (Fielding & Bell, 1997).

(11)

Furthermore, ROC-curves (Receiver Operating Characteristic) will be generated to evaluate the quality of the model. With an ROC-curve the sensitivity is plotted against the false positive rate or 1-specificity. Figure 10 shows an example of different ROC-curves. The more it is aligned in the left top corner the better the classification model is. It can be seen as the divide between correctly predicted positive and correctly predicted negative. The bigger the area beneath the ROC-curve the better the model is. The area under the curve (AUC) will also be used for model quality evaluation. The AUC will be calculated and assessed for the complete dataset.

The ROC-curve or the relationship between sensitivity and specificity will be used to determine the threshold between determining whether to classify a calculated

p-value to presence (1) or absence (0). There are different criteria for choosing a threshold. For this research three different approaches will be compared. First, a fixed threshold of 0.5 will be used, which is default and often used in logistic regression (Manel et al., 1999). This will be compared to the thresholds at the point where sensitivity equals specificity (Fielding & Bell, 1997) and the point where sensitivity plus specificity is at its maximum (Manel et al., 2001). The differences will be examined by calculating the overall accuracy for the entire data set.

To compare the constructed prediction models, five-fold cross validation will be applied. With five-fold cross validation the data is divided in five different training and testing data set. For all five folds the data will be split into 80 percent training data and 20 percent testing data (figure 11). The sampling will be spatially stratified. For every fold a confusion matrix will be constructed to determine the overall accuracy.

In the next section, the results will be discussed. First, the studied area will be elaborated. Then, all the predictor variables will be separately analyzed. As a follow-up of the univariate analysis, the most relevant multivariate analyses will be examined. Subsequently, a comparison will be made between the constructed models; beginning with five-fold cross-validation, followed by threshold adjustment and ending with a visualization of the prediction of the models.

Figure 11. Example of constructed datasets in 5-fold cross-validation

(12)

Results

Study area

In total there were 194 Conibear traps with registered catches in the district of east Flevoland and 212 in the district of south Flevoland. These are all the registrations from 01-01-2014 till 01-01-2016. All the data points are shown in figure 12. The location directly across the riverbank will be used to determine the characteristics of the unused resource units. This accumulated to a total of 812 data points. For all these points, bank steepness, water depth and bank aspect were determined. This led to the following analysis:

(13)

Univariate analysis

The first step to creating a logistic regression model is a univariate analysis of the data. This is done to check the association of the individual explanatory variables with the dependent variable. The data was imported in matlab and the univariate fitted generalized linear model was calculated using the function fitglm with a binomial distribution. Table 2 shows the p-value for the relationship between all the single variables and the occurrence of a muskrat burrow.

Table 2. Overview of univariate analysis.

Variable Water depth Riverbank steepness Degrees east Degrees south

P-value 0.993 0.428 0.0056678 1.527e-10

Coefficient -7.1436e-06 0.012313 -0.0079679 -0.0083412

Exp(coefficient) 1.000 1.0124 0.9921 0.9917

The p-value below 0.05 for degrees east and degrees south indicates that there is a statistically significant association with the presence of a nest site location. The log-transformed coefficients show the relationship between the occurrence of a muskrat burrow and the variables. Concluding from this information, it can be stated that with every degree increasing towards the north, the chance of finding a muskrat burrows decreases with 0.83 percent. When comparing the prediction rate of degrees east and south, the overall accuracy of degrees

south is higher. The distribution of the aspect and the measured values are shown in figure 12. The overall accuracy was the highest at 61.5 percent with the south aspect as a predictor variable. The AUC for this model is 0.6306 and the AIC 1081.8.

The mean depth of the water in the observed area is 135.59 centimeters. The depth of the water bodies in the studied area is uniformly distributed. The p-value and the coefficient of the water depth show that this variable is counterproductive for predicting nest site locations.

Therefore, water depth will not be used for the multivariate analysis.

The mean riverbank steepness in the studied area is 26.62 degrees. The distribution of the riverbank steepness is very homogeneous. The coefficient of the riverbank steepness does indicate that there is a positive relationship between occurrence and steepness. However, the p-value is way above the threshold of 0.05, yet it still is worth examining whether it can contribute to the multivariate model.

(14)

Multivariate analysis

Concluding from the univariate analyses both degrees east and south seem to have a relationship with the presence of muskrat burrows. This led to the model shown in Table 3.

Table 3. Overview of multivariate analysis east & south.

Variable Coefficient P-value

Intercept 1.4398 2.159e-06

Degrees south -0.0082407 2.6931e-10

Degrees east -0.007758 0.01066

The p-values of all the variables indicate that there is a significant association between predictor variables and the response variable. To evaluate the model, the area under the ROC-curve was calculated in matlab. The AUC for this two-variable model is 0.6439 and the AIC for this model is 1082.3. A confusion matrix was then constructed to calculate the overall accuracy, which resulted for this model in 62.8 percent. This seems to be an improvement towards the univariate model. The next step was to find out whether adding the riverbank steepness improved the prediction model. An overview of the coefficients is given in Table 4.

Table 4. Overview of multivariate analyses east, south & steepness.

Variable Coefficient P-value

Intercept 1.0528 0.075276

Degrees south -0.008243 2.6931e-10

Degrees east -0.0077346 0.010917

Riverbank steepness 0.012184 0.44732

The p-value for riverbank steepness indicates that there is no significant association between steepness and the presence of muskrat nests. For comparison the AUC of this model is slightly lower with a value of 0.6426. On the contrary the overall accuracy has increased to 63.55 percent. The increased accuracy can be explained by over fitting. The increased AIC value of 1087.7 indicates the same. The riverbank steepness can be used to create a more accurate prediction model to describe the distribution of muskrat burrows in the observed population. However, it does not apply to other populations. It can therefore be concluded that the model where riverbank steepness is included seems to be a more accurate prediction model. However, the increased AIC and the decreased AUC combined with the high p-values indicate that adding riverbank steepness does not contribute to a better model.

(15)

Five-fold cross validation

To make a comparison between the multiple logistic regression model and the univariate logistic regression model, both models were five-fold cross-validated. This led to the following accumulated confusion matrices for both models:

Table 5. Confusion matrix cross-validation univariate. Table 6. Confusion matrix cross-validation multivariate.

The accuracy of the univariate model is overall better; it has a 61.7 percent overall accuracy against 58.5 percent overall accuracy of the multivariate model. After evaluating through cross-validation, the univariate model is more accurate; the following section will examine if this differs when the thresholds are adjusted.

Threshold adjustment

All confusion matrices above are constructed using the default threshold of 0.5. Does the accuracy change when different thresholds are applied? Two different approaches for determining the threshold will be used. Firstly, where sensitivity equals specificity and secondly, where the sum of the specificity and sensitivity is maximal. To determine this the perfcurve function was used in matlab. This function gives as output the sensitivity for each point of the ROC-curve and the false positive rate, or 1-specificity. It further provides the threshold of classifier scores for the computed values of the sensitivity and the false positive rate. Table 7 and 8 below show the output and the change in accuracy.

Table 7. Values threshold at point where sensitivity equals specificity.

Univariate model Multivariate model

Sensitivity = specificity 0.6158 0.6281

Threshold value 0.5042 0.5002

Change in accuracy 0 0

Table 8. Values threshold at point where sum of sensitivity and specificity is maximal.

Univariate model Multivariate model

Sensitivity + specificity = max 1.2414 1.2635

Threshold value 0.4771 / 0.525 0.4911 / 0.5093

Change in accuracy +0.49 +0.25

A slight improvement in the accuracy occurs when the threshold is adjusted to the value at which the sum of sensitivity and specificity is at its largest. But overall it can be concluded that all three examined methods for determining the threshold give roughly the same output. In the next chapter both models will be compared through a visualization of the predictions.

(16)

Visualizing the prediction

By visualizing the data, clustering in the predicted data can be inspected. For both models the predicted locations are shown in the maps below.

(17)

Figure 15. Overview true positive and true negative of multivariate model

(18)

Figure 17. Overview false positive and false negative of multivariate model

No clear clustering can be found in any of the models. It may seem as if there is some clustering appearing in the multivariate model but after more detailed inspection it can be stated that there is definitely no clustering occurring. No differences between the multivariate and the univariate model can be found after a visual comparison. In the following section, an overview will be given to determine which model is best suited as a prediction model.

Univariate versus multivariate model

Concluding from all comparisons made above between both models, there is little difference to be found between the two. When deciding on which of both is best suited to use as a prediction model, the univariate model has a higher overall accuracy after cross-validation. Therefore, it can be concluded that is has a better discrimination capacity. Subsequently, the degrees south is backed by the ecological theory that muskrats prefer building a nest site in a riverbank located towards the south. When comparing which method is best to set a threshold for determining absence or presence of a burrow, the accuracy is the highest at the threshold value where the sum of sensitivity and specificity is largest. Although it must be stated that the differences between the three approaches are very minimal. In the following segment the results of this research will be broadly discussed and related to previous and future research.

(19)

Discussion

Before answering the research questions a few remarks concerning this research need to be discussed. The first remark concerns the sampling protocol, which was used. For this research sampling protocol C was chosen. This states that unused resource units and used resource units are sampled independently (Manly, McDonald, Thomas, McDonald, & Erickson, 2002). This condition is more or less violated because the sampled unused resource units were all dependent of the used resource units, considering they were all sampled at the point straight across the used resource units. When researching merely the influence of the riverbank aspect this poses no problem since this will always differ from the used resource unit. However, when researching characteristics like water depth and riverbank steepness, the observed data will be very homogeneous due to spatial autocorrelation. A better approach to examine this, is to locate riverbanks with an absence of muskrat burrows and compare these with riverbanks with an abundance of burrows. To investigate this, a criteria needs to be set to determine the unused resource units. For example, if there is 500 meters of waterway without a muskrat burrow, that location should be determined as an unused resource unit. To examine this all the waterways in a district should be assigned as used or unused resource unit. Then, it should be investigated if there is a difference in water depth and riverbank steepness between both. Another way to investigate this is to check the clustering of the occurrence of muskrat burrows. There is, for instance, a big area located in the southern district where no muskrats are found. The characteristics in this area should be compared with the characteristics in the area with a high density of muskrat burrows. These are some advices for future research, but it is also advisable to reflect on previous research conducted for Waterschap Zuiderzeeland.

Previous research done by Maarten Tuijl (2015) examined the preference of muskrats for steep riverbanks. Tuijl only had 51 data points, all of them used resource units, within a very small research area. Tuijl found no relationship between steep riverbanks and the occurrence of muskrat burrows. An explanation for his conclusion can be that his research area was too small and that there was too little variance of riverbank steepness due to spatial autocorrelation.

The research of Rick Heeres and Martijn Struijf (2015) examined the relationship between protected and unprotected riverbanks and the presence of muskrat burrows. Heeres and Struijf found no difference between protected and unprotected riverbanks. An explanation for this can be found in the fact that they conducted their research in an area with no active muskrat population control, which leads to higher muskrat populations and more territorial pressure.

This research did find a relationship between the aspect of riverbanks and the presence of muskrat burrows. The strongest relationship was found between the occurrence of muskrat burrows and the aspect of the riverbank towards the south. However the relationship between the aspect towards the east was almost nearly as determining. The reason for this relationship towards the east can be found in the rectangular shape of the waterways in the province of Flevoland. This is why most of the riverbanks that are located towards the south are also located towards the west.

Future research is definitely recommended, this research needs to be conducted on a bigger scale and further research needs to be done to add more explanatory predictor variables to the model. Other predictor variables that could be examined are the agricultural activity and the riverbank vegetation. Agricultural activity can influence the nest site location in the differentiation between open grasslands and fields with a higher vegetation cover. Next to open grasslands, muskrats could feel less safe and therefore prefer areas with a higher vegetation cover. This could be examined using LiDAR-data of the Flevoland area or by the using the Landelijk Grondgebruik Nederland maps. Riverbank vegetation could also be determined using LiDAR-data and could influence the nest-site selection of muskrats in either being present or absent. If there is an abundance of riverbank vegetation it would seem obvious that this is a more favorable nest site location for muskrats. Adding this variable to the prediction model could therefore be very promising in improving the overall accuracy. Riverbank

(20)

vegetation could also be related to the agricultural activity because farmers could cut away the aquatic vegetation.

The most important variable that needs more examining is the riverbank steepness. This need to be done through a new and more extensive approach as mentioned above. Furthermore, more data concerning the aspect need to be gathered. The most promising unexamined predictor variable could be the riverbank vegetation.

The biggest recommendation for the Waterschap Zuiderzeeland is that more data needs to be gathered by the muskrat population controllers in the field and stored through the MuRa-app. The muskrat catchers should note information concerning terrain and water characteristics after placing the conibear traps. They could easily determine the water depth, riverbank aspect, agricultural activity, and riverbank vegetation while conducting their daily activities. This could quickly provide more insight in the species distribution of muskrats and could be very meaningful for future research.

Furthermore, during this research some difficulties were encountered while using the vangst registratie website. The system was updated to a new version during this research. The purpose of this update was to improve the interface. Some improvements were made but also a lot of useful features disappeared, such as selecting which traps were used or selecting time frames. This problem needs to be brought forward to the company that is responsible for this update, Nieuwland Geo-informatie.

(21)

Conclusion

After an extensive examination and discussion of the results the following questions can be answered: Do bank aspect, bank steepness and water depth determine the occurrence of muskrat burrows? After the first univariate analyses no relationship was found between bank steepness or water depth and the occurrence of muskrat burrows. A relationship was found between the bank aspect and the presence of muskrat nests. The strongest association was found between the aspect towards the south although it differed very little from the aspect towards the east. The rectangular shape of the waterways in the province of Flevoland could explain this relationship of the eastern aspect.

To what extend is it possible to predict the nest site locations of muskrat in Flevoland on the basis of terrain and water characteristics?

The best model to predict the nest site locations of muskrats in Flevoland is the univariate model with degrees south as a predictor variable. With this model it is possible to predict 61.5 percent of the locations correctly. To improve the overall accuracy more predictor variables need to be added to the model. Follow up research concerning the riverbank steepness could contribute to constructing a more accurate prediction model. New research needs to be done focusing on the relationship between riverbank vegetation and the occurrence of muskrat burrows.

The created prediction model will not cause a big breakthrough in muskrat damage control. Muskrat population controllers can be advised to focus a bit more on riverbanks located towards the south. In order to implement preventive measures this research is not accurate enough, as mentioned above, additional advanced follow up research is definitely needed.

(22)

References

Ahlers, A.A., Heske, E.J., Schooley, R.L. & Mitchell, M.A. 2010, "Home ranges and space use of muskrats Ondatra zibethicus in restricted linear habitats", Wildlife Biology, vol. 16, no. 4, pp. 400-408.

Allen, A.W. & Hoffman, R.D. 1984, Habitat suitability index models: muskrat.

Barends, F. 1998, "Muskusrat en Beverrat: mooi maar lastig", LEVENDE NATUUR, vol. 99, pp. 12-17. Benthem, M. 2015, Detection of cavities in levees caused by the muskrat and other mammals by use of

geophysical methods.

Errington, P.L. 1963, Muskrat populations, Iowa State University Press Ames.

Fielding, A.H. & Bell, J.F. 1997, "A review of methods for the assessment of prediction errors in conservation presence/absence models", Environmental Conservation, vol. 24, no. 01, pp. 38-49. Heeres, R.W. & Struijf, M.S. 2016, Muskusratten bouwen in oevers. Afstudeerrapport 2016 Waterschap

Zuiderzeeland, Lelystad.

Heidinga, D. 2006, "Pluizige plaagdieren, Ecologie en bestrijding van de muskusrat".

Landelijk jaarverslag 2007 muskus- en beverrattenbestrijding. Landelijke Coördinatie Commissie Muskusrattenbestrijding (LCCM), Tiel, The Netherlands (2008).

Le Boulengé, E. & Boulengé-Nguyen, L. 1981, "Ecological study of a muskrat population", Acta Theriologica, vol. 26, no. 4, pp. 47-82.

Liu, C., White, M. & Newell, G. 2009, "Measuring the accuracy of species distribution models: a review", Proceedings 18th World IMACs/MODSIM Congress. Cairns, AustraliaCiteseer, , pp. 4241.

Manel, S., Dias, J. & Ormerod, S.J. 1999, "Comparing discriminant analysis, neural networks and logistic regression for predicting species distributions: a case study with a Himalayan river bird", Ecological Modelling, vol. 120, no. 2, pp. 337-347.

Manel, S., Williams, H.C. & Ormerod, S.J. 2001, "Evaluating presence–absence models in ecology: the need to account for prevalence", Journal of Applied Ecology, vol. 38, no. 5, pp. 921-931. Manly, B., McDonald, L., Thomas, D., McDonald, T. & Erickson, W. 2002, "Resource selection by

animals: statistical analysis and design for field studies", Nordrecht, The Netherlands: Kluwer. Meyer, C.B., Miller, S.L. & Ralph, C.J. 2004, "Logistic regression accuracy across different spatial and

temporal scales for a wide-ranging species, the Marbled Murrelet", .

Meynard, C.N. & Quinn, J.F. 2007, "Predicting species distributions: a critical comparison of the most common statistical models using artificial species", Journal of Biogeography, vol. 34, no. 8, pp. 1455-1469.

Tuijl, M. 2015, Muskusrattenschade in relatie tot de hellingsgraad van bovenwater oevertaluds. Onderzoeksrapport 2015 Waterschap Zuiderzeeland, Lelystad.

Van Troostwijk, Wilco Julius Doude 1976, The musk-rat (Ondatra zibethicus L.) in the Netherlands: its ecological aspects and their consequences for man, Troostwijk.

Van Vliet, F. & Lengkeek, W. 2007, Alternatieve strategieën voor de bestrijding van muskusratten: Haalbaarheidsstudie en voorbereiding veldexperimenten.

Zandberg, F., Jong, P.d. & Lunteren, J.v. 2011, "Muskusrat (Ondatra zibethicus): op alternatieve wijze schade voorkomen".

(23)

(24)

Appendix B. Confusion matrices univariate model Confusion matrix False (predicted) True (predicted) False (observed) 250 156 True (observed) 156 250

Confusion matrix south Confusion matrix False (predicted) True (predicted) False (observed) 239 167

(25)

Appendix C. Confusion matrices multivariate model

Confusion matrix east & south

Confusion matrix east & south & riverbank steepness

Appendix D. Confusion matrices five-fold cross-validation.

Univariate model, aspect south. Confusion matrix False (predicted) True (predicted) False (observed) 48 33 True (observed) 33 48 Confusion matrix False (predicted) True (predicted) False (observed) 44 37 True (observed) 38 44

(26)

Confusion matrix False (predicted) True (predicted) False (observed) 52 30 True (observed) 29 52 Confusion matrix False (predicted) True (predicted) False (observed) 49 32 True (observed) 33 48 Confusion matrix False (predicted) True (predicted) False (observed) 46 35 True (observed) 35 46

Multivariate model, aspect east and south. Confusion matrix False (predicted) True (predicted) False (observed) 51 30 True (observed) 30 51 Confusion

(27)

Confusion matrix False (predicted) True (predicted) False (observed) 49 32 True (observed) 32 49 Confusion matrix False (predicted) True (predicted) False (observed) 52 29 True (observed) 29 52 Confusion matrix False (predicted) True (predicted) False (observed) 44 37 True (observed) 38 44

(28)

Appendix E. Matlab script for analysis

% Script Bachelor thesis Reint de Koning 10111085 % Prediction model nest site location muskrats % University of amsterdam

clear all close all clc

%Import data

oost = csvread('data_rayon_oost.csv',1);

zuid = csvread('data_rayon_zuid.csv',1);

%% Assign variables % Presence Absence tf_oost = oost(:,2); tf_zuid = zuid(:,2); % Degrees North deg_north_oost= oost(:,3); deg_north_zuid= zuid(:,3); % Degrees East deg_east_oost= oost(:,5); deg_east_zuid= zuid(:,5); % Talud talud_oost= oost(:,7); talud_zuid=zuid(:,7); % Water depth waterdepth_oost= oost(:,8); waterdepth_zuid= zuid(:,8);

% Ammount muskrats catched

ammount_catched_oost = oost(:,9); ammount_catched_zuid = zuid(:,9);

% Ammount muskrats placed

ammount_placed_oost = oost(:,10); ammount_placed_zuid = zuid(:,10);

%% combine rayons

dnt = [deg_north_oost;deg_north_zuid]; % degrees south

det = [deg_east_oost;deg_east_zuid]; % degrees east

wdepth=[waterdepth_oost;waterdepth_zuid]; % water depth

talud=[talud_oost;talud_zuid]; % talud

y = [tf_oost;tf_zuid]; % occurrence burrows

y1=logical(y); % create logical of occurrence

for calculating AUC

%% Univariate analysis south

mdl_south = fitglm(dnt,y,'linear','Distribution','binomial');

(29)

% Calculate AUC, AIC & Threshold values

mdl_south.ModelCriterion;

scores1 = mdl_south.Fitted.Probability;

[Xsouth,Ysouth,Tsouth,AUCsouth] = perfcurve(y1,scores1,'true');

mdl_east.ModelCriterion;

scores2 = mdl_east.Fitted.Probability;

[Xeast,Yeast,Teast,AUCeast] = perfcurve(y1,scores2,'true');

mdl_wdepth.ModelCriterion;

scores3 = mdl_wdepth.Fitted.Probability;

[Xwdepth,Ywdepth,Twdepth,AUCwdepth] = perfcurve(y1,scores3,'true');

mdl_talud.ModelCriterion;

scores4 = mdl_talud.Fitted.Probability;

[Xtalud,Ytalud,Ttalud,AUCtalud] = perfcurve(y1,scores4,'true');

%%Combine east and south degrees as predictor variables

x1 = [dnt,det];

% Multivariate analysis East+South

mdl_totalaspect = fitglm(x1,y,'linear','Distribution','binomial')

% Calculate AUC & AIC

mdl_totalaspect.ModelCriterion; scores5 = mdl_totalaspect.Fitted.Probability; [Xtotalaspect,Ytotalaspect,Ttotalaspect,AUCtotalaspect] = perfcurve(y1,scores5,'true'); % Visualise output plot(Xtotalaspect,Ytotalaspect)

xlabel('False positive rate')

ylabel('True positive rate')

title('ROC for Classification by Logistic Regression East and South')

%%Add talud to east and south degrees as predictor variables

x2 = [dnt,det,talud];

% Multivariate analysis East+South

mdl_aspect_talud = fitglm(x2,y,'linear','Distribution','binomial')

% Calculate AUC & AIC

mdl_aspect_talud.ModelCriterion; scores6 = mdl_aspect_talud.Fitted.Probability; [Xaspect_talud,Yaspect_talud,Taspect_talud,AUCaspect_talud] = perfcurve(y1,scores6,'true'); % Visualise output figure(2) plot(Xaspect_talud,Yaspect_talud)

xlabel('False positive rate')

ylabel('True positive rate')

title('ROC for Classification by Logistic Regression East, South And

Talud')

(30)

for i= 1:812

pred(i) = exp(2.0453-0.022727*dnt(i))/(exp(2.0453-0.022727*dnt(i))+1);

end

% Classify presence or absence with default threshold

for i=1:812 if pred(i)<0.6 pred(i)=0; else pred(i)=1; end end

% create confusion mat

[Csouth,~] = confusionmat(y,pred)

%% Construct model for degrees south and east

for i= 1:812

pred(i) = exp(1.4398-0.0082407*dnt(i)-0.007758*det(i))/(exp(1.4398-0.0082407*dnt(i)-0.007758*det(i))+1);

end

[Ctotal_aspect,~] = confusionmat(y,pred)

%% Construct model for degrees south, east and talud

for i= 1:812

pred(i) = exp(1.0528 0.008243 *dnt(i)

0.0077346*det(i)+0.012184*talud(i))/(exp(1.0528 0.008243 *dnt(i) -0.0077346*det(i)+0.012184*talud(i))+1);

end

(31)

testx10= x1(651:end,:); trainx2 = x1(163:end,:); trainx4 = [testx2;x1(326:end,:)]; trainx6 = [x1(1:325,:);x1(488:end,:)]; trainx8 = [x1(1:489,:);x1(650:end,:)]; trainx10 = x1(1:651,:); testy2= y(1:162,:); testy4= y(163:325,:); testy6= y(326:488,:); testy8= y(489:650,:); testy10= y(651:end,:); trainy2 = y(163:end,:); trainy4 = [testy2;y(326:end,:)]; trainy6 = [y(1:325,:);y(488:end,:)]; trainy8 = [y(1:489,:);y(650:end,:)]; trainy10 = y(1:651,:);

%% Cross validation for multivariate model % Calculate coeficients first fold

mdl_totalx2 = fitglm(trainx2,trainy2,'linear','Distribution','binomial')

% make model of training data set

dnt = testx2(:,1); det = testx2(:,2);

for i= 1:length(testx2)

predx2(i) = exp( 2.0495 -0.0082421*dnt(i) -0.014532*det(i))/(exp( 2.0495 -0.0082421*dnt(i) -0.014532*det(i))+1);

end

% classify training data set

for i=1:length(testx2) if predx2(i)<0.5 predx2(i)=0; else predx2(i)=1; end end

% create confusion matrix of data

[Ctesty2,~] = confusionmat(testy2,predx2)

% Calculate coeficients 2nd fold

predx4(i) = exp( 1.0016 -0.0091378*dnt(i) -0.0020384*det(i))/(exp( 1.0016 -0.0091378*dnt(i) -0.0020384*det(i))+1);

end

for i=1:length(testx4)

if predx4(i)<0.5 predx4(i)=0; else predx4(i)=1;

end end

(32)

% Calculate coeficients third fold

predx6(i) = exp(1.0989 0.0073542*dnt(i)

-0.0048116*det(i))/(exp(1.0989 -0.0073542*dnt(i) -0.0048116*det(i))+1);

end

% Calculate coeficients 4th fold

predx8(i) = exp(1.4777 0.0088744*dnt(i)

end

% Calculate coeficients fold

predx10(i) = exp(1.6094 0.0074973*dnt(i)

end

(33)

%% construct for univariate model

% divide only degrees north in training and test data

x1=[deg_north_oost;deg_north_zuid]; testx2= x1(1:162,:); testx4= x1(163:325,:); testx6= x1(326:488,:); testx8= x1(489:650,:); testx10= x1(651:end,:); trainx2 = x1(163:end,:); trainx4 = [testx2;x1(326:end,:)]; trainx6 = [x1(1:325,:);x1(488:end,:)]; trainx8 = [x1(1:489,:);x1(650:end,:)]; trainx10 = x1(1:651,:); % Calculate coeficients

mdl_totalxn2 = fitglm(trainx2,trainy2,'linear','Distribution','binomial')

dnt = testx2(:,1);

predxn2(i) = exp( 0.75721 0.0084136*dnt(i) )/(exp( 0.75721 -0.0084136*dnt(i))+1);

end

if predxn2(i)<0.5 predxn2(i)=0;

else predxn2(i)=1;

end

[Ctesty2UNI,~] = confusionmat(testy2,predxn2)

% Calculate coeficients 2nd fold

dnt = testx4(:,1);

predxn4(i) = exp( 0.82021 0.009161*dnt(i) )/(exp( 0.82021 -0.009161*dnt(i))+1);

end

else predxn4(i)=1;

end

(34)

dnt = testx6(:,1);

predxn6(i) = exp( 0.664 0.0073622 *dnt(i) )/(exp( 0.664 -0.0073622 *dnt(i) )+1);

end

else predxn6(i)=1;

end

dnt = testx8(:,1);

end

else predxn8(i)=1;

end

mdl_totalxn10 =

fitglm(trainx10,trainy10,'linear','Distribution','binomial')

dnt = testx10(:,1);

end

if predxn10(i)<0.50 predxn10(i)=0; else predxn10(i)=1;