
Mapping Flood Defences

Improving Global Geospatial Coverage of Dykes and Levees

By Fergus Miller Kerins

Assessor: Erik Cammeraat, Co-assessor: Emile van Loon

Daily Supervisors: Dirk Eilander & Timothy Tiggeloven

MSc Earth Science: Environmental Management

Student Number: 12817031

Acknowledgements

I would like to express my sincere gratitude to my supervisor Erik Cammeraat and my assessor Emile van Loon. I also extend special thanks to my daily supervisors Dirk Eilander and Timothy Tiggeloven, who gave me countless useful tips and guidance without which I could not have completed this MSc thesis. Additionally, I extend special thanks to my colleague at the UvA, Bart Hoekstra, who gave me invaluable suggestions and help with my programming.


Abstract

Flooding is hugely damaging and will become more so as the century progresses. Flood models are crucial in providing decision-makers with the information they need to respond effectively to flood risk. However, the accuracy of regional- and global-scale flood models is limited by a lack of data on flood defences. This MSc thesis responds directly to this challenge by assessing the popular crowdsourced database OpenStreetMap (OSM) as a source of flood defence data. Data mining shows that the OSM database contains 128,859 flood defence features. The majority of these are in Europe (96,599 features), but the database still offers global coverage. Crowdsourced data is known to have reliability issues; in OSM's case this translates into features being incorrectly tagged. This thesis therefore assesses the accuracy of OSM flood defence features using a Random Forest Classifier model. The model is trained using a manually classified subset of OSM flood defence features and then used to classify the entire OSM flood defence dataset. According to the classification, 2,063 defences are correctly tagged and 126,273 are incorrect. This low percentage of correct features is likely to be an underestimation. As such, steps are suggested to build upon this methodology and provide a more balanced picture of which OSM flood defence features are correctly tagged. Doing so would open a large global dataset of flood defences to flood modellers, helping to address the key issue of scarce flood defence data limiting flood model accuracy. By extension, building upon this research would greatly benefit global efforts to minimise the severe risks of economic damage and loss of life posed by flooding.


Contents

Chapter 1: Introduction

Chapter 2: Methodology

Chapter 3: Results

Chapter 4: Discussion

Conclusion

References

Appendix


Chapter 1: Introduction

Flooding is a huge and growing problem (Tiggeloven et al., 2020). One response to flood risk is building physical structures, such as dykes and levees, which protect people and buildings from flooding. Decision-makers need accurate information about flood risk both to determine where these structures are best built and to estimate the effect they will have. Such information is available from flood models; however, the effectiveness of these models is limited by the fact that there is no accurate global database of flood defence structures (Scussolini et al., 2016). This MSc thesis responds directly to this problem by examining OSM as a source of global flood defence data. Through this, it seeks to enable use of this database by flood modellers. By extension, it intends to improve the information available to decision-makers and the effectiveness of global responses to flooding.

This first chapter introduces the importance of improving global flood defence data and highlights the huge societal benefit of doing so. It examines the main relevant literature, discusses the gaps in currently available flood defence data and explains how it is proposed that these gaps be addressed. In terms of structure, it first looks at flooding in general, then discusses flood models and how these are limited by a lack of data on flood defences. Following this, it examines volunteer geographic information (VGI) as a potential data source and explores how it could be utilised to improve global data coverage of flood defences. It finishes by highlighting the key research questions this MSc thesis seeks to answer.

1.1. Flooding

Flooding is the most damaging natural hazard affecting humanity, causing over 225,000 deaths and $1.6 trillion in losses between 1980 and 2016 (Jongman et al., 2018). Flooding is generally conceptualised in terms of flood risk, the product of hazard, exposure and vulnerability (United Nations Office for Disaster Risk Reduction, 2016). Flood hazard is the actual physical process causing flooding; exposure consists of the human and non-human elements, such as buildings and infrastructure, potentially exposed to flooding; and vulnerability represents how vulnerable these human and non-human elements are (Jongman et al., 2018).

Flood risk will only increase as the century progresses due to more frequent extreme weather events (Schiermeier, 2011), sea level rise (Tiggeloven et al., 2020), and population and economic growth in flood-prone areas (Scussolini et al., 2016). Low-income countries are particularly vulnerable to floods, often due to a lack of adequate flood defences, disaster response capabilities, and social and financial protection (Jongman et al., 2015). Cities such as Mumbai already face substantial flood risk (Ranger et al., 2011), and this will increase substantially unless adequate responses are taken.

1.2. Flood Models

To respond effectively to flood risk, policymakers need high-quality information about where future flood events are likely to take place and their projected impact (Jongman et al., 2018). The quantification of flood risk offered by flood models allows for proactive, cost-effective mitigation (Hall et al., 2012). Data on flood hazard, exposure and vulnerability can inform policy responses to climate change, identify at-risk hot-spots, ensure the most effective mitigation options are chosen and quantify the financial costs and benefits of particular flood mitigation strategies (Scussolini et al., 2016; Hall et al., 2012). Flood models can examine coastal flooding (e.g. Narayan et al., 2012), riverine flooding (e.g. Leander et al., 2008), pluvial flooding (e.g. Kaspersen et al., 2017) or a combination of flooding types (e.g. Jongman, Ward & Aerts, 2012).

Flood models have benefited from factors such as enhanced computational capacity and the increased availability of high-resolution data (Wing et al., 2019). A key development was NASA's Shuttle Radar Topography Mission (SRTM), which offers global DEM (Digital Elevation Model) coverage. These developments mean that in addition to examining local-scale flooding (e.g. Du et al., 2020), models can assess regional-scale (e.g. Wing et al., 2017; Bozza et al., 2016) and global-scale flooding (e.g. Tiggeloven et al., 2020). However, there is a clear performance gap whereby local models outperform those that operate on a larger scale (Wing et al., 2019). Numerous academic papers point to a lack of adequate information on the presence and nature of flood defences undermining large-scale model performance (e.g. Scussolini et al., 2016; Wing et al., 2017; Wing et al., 2019).

A few factors explain the frequently inadequate representation of flood defences in flood models. Schumann (2014) describes a disparity in DEM quality between regions: even the most accurate global DEMs do not offer sufficient resolution to detect certain terrain features relevant to flooding. Another issue is computational capacity; large-scale flood models are highly computationally demanding, which can mean they have to simplify terrain (Wing et al., 2019). Essentially, in large-scale and global flood models, the pursuit of computational efficiency and/or the lack of adequate-resolution DEM coverage leads to terrain being 'smoothed out'. Flood defences, which are often in low-elevation floodplains, are effectively removed from models. Scussolini et al. (2016) describe that in the absence of accurate information about flood defences, models frequently assume no protection. As Ward et al. (2013) demonstrate, this leads to an overestimation of flood hazard, particularly for frequent smaller floods which in reality are protected against. It is abundantly clear from the literature that there is a knowledge gap: a lack of adequate information about flood defences on a global scale is impeding the accuracy of flood models (Wing et al., 2019; Scussolini et al., 2016; Ward et al., 2013). If this knowledge gap can be addressed, the quality of large-scale flood models, and by extension the quality of information given to policymakers as they respond to flooding, can be improved.

1.3. Improving Data Coverage of Flood Defences

There have been some attempts to address the lack of global data on flood defences. Scussolini et al. (2016) created FLOPROS (FLOod PROtection Standards), a database offering global flood protection standards. The FLOPROS database uses the best available information from three layers (the design layer, the policy layer and the model layer) to provide protection standards for administrative units globally. Whilst FLOPROS is very useful, it has limitations. Scussolini et al. (2016) acknowledge that due to a lack of empirical data, developing regions including South America and Africa have less accurate information than developed regions such as Europe and North America. The authors do suggest that FLOPROS can be continually improved with crowdsourced information, including VGI datasets.

Fathom-US is a 30 m resolution hydrodynamic model of the conterminous United States created by Wing et al. (2017) using publicly available data. Their model incorporates data from the U.S. Army Corps of Engineers National Levee Database (NLD) (USACE, 2020) to accurately represent flood defences. Under testing, the model produces more accurate flood coverage when flood defences are 'defended' (i.e. inputted from the NLD) as opposed to 'undefended' (only represented where they show up in the DEM, independent of processing). The model performs to a high standard and offers greater coverage than combining available local studies. However, this method is not applicable on a wider scale, as the NLD is only available in the U.S.


Wing et al. (2019) offer a particularly promising method to improve flood defence representation in models. They use derivatives calculated from the source data of the DEM used in Fathom-US (Wing et al., 2017) to automatically detect levee placement and crest height using an algorithm. These levee crests can then be 'defended' (i.e. explicitly retained in the model) to improve flood model accuracy. Wing et al. (2019) validate the algorithm in California using the California Levee Database (CLD). They also re-run Fathom-US using the newly defended levees and test the algorithm in the Po River floodplain, Italy, to assess its wider global potential. The algorithm performs well in identifying levees and determining levee crest heights, thereby improving flood model accuracy in the various test locations. However, despite its strengths, the algorithm is essentially reliant on high-resolution DEM input (at least 1/3 arc-second). The widely available SRTM imagery does not offer sufficient resolution (SRTM has 3 arc-second resolution) (USGS, n.d.). As such, applicable areas are limited to Western Europe, North America and some of Australia.

As Wing et al. (2019) note, all prior attempts to automatically detect flood defences use quantitative definitions of levees (i.e. parameter thresholds) which are not globally applicable. Choung (2014), who maps levees in river basins in South Korea, and Casas et al. (2012), who assess levee stability in Southern California, both use LiDAR (Light Detection and Ranging) data, which is not feasible on a large scale. As such, Wing et al. (2019) are deemed here to provide the best framework for large-scale dyke and levee detection using DEM input.

1.4. Volunteer Geographic Information and OpenStreetMap

It is proposed here that VGI offers a useful avenue for responding to the problem of a lack of global coverage of flood defences. VGI is geographic information created by citizens themselves on bespoke online platforms. It is democratic in that there are no requirements in terms of experience or expertise from contributors (Senaratne et al., 2017). VGI has key advantages: it offers large data volume, abundant information and potentially global coverage, all delivered at a low cost (Senaratne et al., 2017).

The most prominent open-source platform for VGI is OSM, which offers an editable world map primarily created by volunteers all over the world (Ma, 2017; Bozza et al., 2016). OSM should be seen as a key dataset for hydrologists and flood modellers, containing over 30 million tagged objects that directly relate to water (OpenStreetMap Taginfo, n.d.). OSM utilises georeferencing, where satellite imagery (available through products such as Google Earth) is used to visually locate features. The user can then manually record the coordinates of features such as buildings, trees and waterbodies (Goodchild, 2007).

With active mapping communities across the globe, flexible data contribution mechanisms and an ever-growing contributor community that already encompasses millions (Senaratne et al., 2017), OSM's data coverage will keep increasing. Moreover, OSM's open-source nature (Goodchild, 2007) means it can be used flexibly for a multitude of hydrological and modelling applications. It is for these reasons that OSM is seen as an appropriate platform on which to improve geospatial coverage of flood defences.

Despite the clear opportunities presented by OSM, its volunteer contributor model can present data accuracy and reliability issues (Schellekens, 2014). Goodchild (2007) notes that VGI often comes without any form of reference or citation. Senaratne et al. (2017) add that OSM contributors use different tools and technologies, with varying precision and for heterogeneous purposes, which can affect data quality.

Multiple responses to VGI reliability challenges have been suggested. Goodchild and Li (2012) discuss that crowdsourcing (where a group validates and corrects individual contributions), social approaches (where trusted individuals serve as gatekeepers) and geographic approaches (which rely on fundamental geographic laws for verification) are all valid methods for assessing VGI quality. Donchyts et al. (2016) argue that automated tools represent useful methods to validate OSM data quality. This is particularly attractive given the huge amount of data contained in OSM; manual verification of all OSM features is effectively impossible.

1.5. Problem Addressed

This research is the first to attempt to assess the accuracy of tagged flood defences in OSM on a global scale. As such, it is novel research that can further the use of arguably the largest global database of flood defences. It takes the position, in agreement with Donchyts et al. (2016), that automated tools are necessary for the validation of OSM data quality. Wagenaar et al. (2020) note that machine learning is poised to have a large impact on how flooding is modelled going forward, and describe that it can be suitable for classification problems relating to geospatial data. This thesis uses Random Forest Classification (a type of machine learning) to examine the quality of tagged flood defences in OSM. Other papers use Random Forest Classification for assessing flood risk in general (e.g. Wang et al., 2015) and classifying salt marsh vegetation (e.g. Van Beijma, Comber and Lamb, 2014), but to the knowledge of this author it has never before been used explicitly to assess the accuracy of mapped flood defence features on a large scale. This novel approach can provide flood modellers with the information necessary to use OSM flood defence data in local, regional and global models. As such, the research has the potential to offer great benefits to flood modellers and, by extension, to help decision-makers respond to the large and ever-growing challenges posed by global flooding.

1.6. Research Questions

The key research questions addressed in this research, as it assesses OSM as a source of flood defence data, are as follows:

• What is the current availability and accuracy of flood defence data in OSM?

• How many flood defence features are available in OSM?

• How accurately tagged are these flood defence features? Are specific tags used for flood defence features more accurate than others?

• Can the quantity and quality of geospatial information about global flood defences be improved using OSM?


Chapter 2: Methodology

This chapter outlines the methodology followed to assess the accuracy of OSM flood defence features, thus enabling these features to be used in flood models. Figure 1 gives an overview of these methods, which comprise four main steps: (1) all OSM flood defence features are extracted from the entire OSM dataset, allowing for an assessment of their quantity and quality (henceforth known as the 'All Defence Features'); (2) a subset of these features is manually classified (henceforth known as the 'Classified Defences') and later used to train the Random Forest Classifier (RFC) model which classifies the entire dataset; (3) summary statistics are calculated from the Input Feature Datasets (see Table 1) for both the Classified Defences and All Defence Features datasets; these statistics serve as the other required inputs for the RFC; (4) the RFC is used to classify the All Defence Features dataset and determine the number of correct and incorrect flood defences.

Figure 1: Flowchart of the methodology used in this thesis. First, the All Defence Features are extracted; second, these are manually classified (creating the Classified Defences); third, summary statistics to be used for machine learning classification are calculated from the Input Feature Datasets; finally, the All Defence Features dataset is classified as correct or incorrect using an RFC model.

The RFC classifies by using the Input Feature Dataset summary statistics for the Classified Defences to learn what values indicate a correct or incorrect defence. It can then classify the All Defence Features as correct or incorrect based on their Input Feature Dataset summary statistics. The Input Feature Dataset summary statistics used for classification (see Figure 1) are calculated from four Input Feature Datasets (these are summarised in Table 1). These datasets are: Global Surface Water: Maximum Water Extent (GSW) (Pekel et al., 2016), Global Floodplains (GFPLAIN250m, or GFP) (Nardi et al., 2019), Multi-Error-Removed Improved Terrain Digital Elevation Model (MERIT DEM, or DEM) (Yamazaki et al., 2017) and Multi-Error-Removed Improved Terrain Hydro: Height Above Nearest Drainage (MERIT Hydro: HAND, or HAND) (Yamazaki et al., 2019).

2.1. Data Extraction

The first step in establishing the accuracy of OSM flood defences is to determine exactly what flood defence features the dataset contains and where these are (see Figure 1). To do this, relevant tags for flood defence features are established, the entire OSM dataset is downloaded, and the flood defence features are extracted. As OSM consists of crowdsourced data, it is prone to inconsistencies in the labelling of features (Schellekens, 2014). As such, flood defence features cannot be found using only a single feature tag (such as 'man_made=dyke'); the most common tags used to describe flood defences must first be found. This can be achieved by searching the OSM Wiki (OpenStreetMap Wiki, n.d.) and the website Taginfo, which allows users to access statistics about tagged features in the OSM database (OpenStreetMap Taginfo, n.d.). Features can be found using the search terms "dyke" and "levee". Table 2 shows the most common flood defence tags used by OSM contributors.

Dataset | Dataset Category | Relevant Variables | Source | Data | Extension
OpenStreetMap | Flood Defence Features | All Defence Features | OpenStreetMap (n.d.) | Vector Data | -
Maximum Water Extent (GSW) | Water Coverage | Location | Pekel et al. (2016) | GeoTIFF | 30m
MERIT Hydro: HAND | Height Above Nearest Drainage (HAND) | Location, height above drainage | Yamazaki et al. (2019) | GeoTIFF | 90m
MERIT DEM | DEM | Location, elevation | Yamazaki et al. (2017) | GeoTIFF | 90m
GFPLAIN250m | Floodplain Location | Location | Nardi et al. (2019) | GeoTIFF | 250m

Table 1: Table showing the datasets used in this thesis, including dataset category, relevant variables, source, data type and extension (resolution).


Next, the entire OSM database is downloaded. This is done using the Geofabrik website, which facilitates bulk downloads of OSM data for individual countries and regions (Geofabrik, n.d.). This data is downloaded for all world regions. Following this, Python is used to parse the entire OSM database and extract the tagged flood defence features. The flood defence features are then combined into one coherent and manageable dataset using GIS (Geographic Information System) software. This is the All Defence Features dataset (see Figure 1). The All Defence Features dataset contains 128,859 features which are distributed across the globe (see Figure 2). The defences are split amongst different tags as shown in Table 2; note the sum of tagged features shown in the table exceeds the total extracted, as some features have duplicate tags.

Tag | Feature Count
man_made=embankment | 115,199
man_made=dyke | 16,617
tiger:name_base=Levee | 571
embankment=levee | 240
tiger:name_base=Dyke | 131
waterway=dyke | 127
man_made=levee | 98
embankment=dyke | 85

Table 2: Table showing the most common tags used to describe flood defences in OSM. Note these tags were found by searching the terms "dyke" and "levee" on the Taginfo website (OpenStreetMap Taginfo, n.d.).

Figure 2: Figure showing the global spread of tagged flood defences in OSM. Note the concentrations of features in Europe and Eastern Asia. Continents such as Africa, Oceania and South America appear to be underrepresented in relation to their landmass.
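To make this extraction step concrete, the sketch below shows one way it could be implemented, assuming the pyosmium library and a hypothetical Geofabrik extract filename; the actual script used in this research (available on GitHub) may differ.

```python
# A minimal sketch of extracting tagged flood defence features from a
# Geofabrik .osm.pbf extract using pyosmium. The file name is hypothetical.
import osmium

# Tag combinations treated as potential flood defences (see Table 2).
FLOOD_DEFENCE_TAGS = {
    ("man_made", "embankment"), ("man_made", "dyke"), ("man_made", "levee"),
    ("embankment", "levee"), ("embankment", "dyke"), ("waterway", "dyke"),
    ("tiger:name_base", "Levee"), ("tiger:name_base", "Dyke"),
}

class DefenceHandler(osmium.SimpleHandler):
    """Collects ways whose tags match any flood defence tag combination."""

    def __init__(self):
        super().__init__()
        self.defences = []

    def way(self, w):
        tags = {t.k: t.v for t in w.tags}
        if any(tags.get(key) == value for key, value in FLOOD_DEFENCE_TAGS):
            self.defences.append((w.id, tags))

handler = DefenceHandler()
handler.apply_file("europe-latest.osm.pbf")  # one region; repeat per region
print(f"{len(handler.defences)} tagged flood defence features found")
```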


2.2. Data Classification

Next, a subset of the All Defence Features must be classified so it can be used to train the RFC (see Figure 1). This is done by first randomly selecting features to be classified and then classifying them through visual observation with the aid of predetermined classification criteria. To randomly select a subset of features, random points are generated throughout the entire global landmass (minus Antarctica which, with 40 tagged defence features, is deemed to have an insufficient number of flood defence features for useful assessment). These random points are then snapped to the nearest flood defence, and these defences are used for assessment. This creates an even spatial distribution of defences across the globe (a minimal sketch of this sampling step is given at the end of this section). It is important to have a consistent way to determine whether a flood defence is correct or not. To achieve this, a set of criteria is created; these criteria are as follows:

• Only man-made dykes and levees are defined as flood defences; natural structures are deemed outwith the scope of this study.

• Structures which only protect permanent water (e.g. a harbour) are not classified as defences. Whilst hydraulically relevant, such structures exist principally to ensure calm water as opposed to explicitly limiting flooding.

• Although they are also hydraulically relevant, dams and reservoirs are not deemed to be flood defences within the scope of this study, as the majority of these structures have the principal purpose of retaining permanent water. It should be noted that omitting these is a subjective decision: some dams serve principally to protect against flash floods. This decision is further addressed in Discussion Section 4.1.a.

• Small features (under 30m) are deemed unlikely to be a defence (including features protecting a compound), as floodwater could easily flow around them.

• Features at high elevation are deemed less likely to be a defence (excluding features in valleys), as water here is likely to run off and drain away rather than accumulate.

• Features not protecting anything of value (e.g. buildings, roads, etc.) are deemed less likely to be a defence.

• Rail embankments are deemed unlikely to be a defence; these generally do not have a high enough crest height to stop flood water.

• Features under forest are deemed less likely to be a defence, as forest cover makes ongoing maintenance of such features difficult.

Manual classification is carried out by visually observing flood defences and assessing them according to the classification criteria. In this way, features are labelled as correct or incorrect. Each individual defence is observed in Google Earth (Google Earth, n.d.) using both current and historical satellite imagery. In addition, each feature is visualised in GIS software and compared with OSM and Google Satellite base layers. In total, 384 features are classified.
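As a rough illustration of the sampling step described above, the sketch below uses geopandas (version 0.12 or later for sample_points and sjoin_nearest); the file names and sample size are hypothetical placeholders rather than the exact ones used in this research.

```python
# A minimal sketch of random point generation and snapping to the nearest
# tagged defence. Distances are computed in the layer CRS, so a projected
# CRS is preferable to raw latitude/longitude.
import geopandas as gpd

defences = gpd.read_file("all_defence_features.gpkg")  # hypothetical file
land = gpd.read_file("global_landmass.gpkg")           # Antarctica removed

# Merge the landmass into one geometry, draw points uniformly over it, then
# keep the defence nearest to each point, giving an even spatial spread of
# features for manual classification.
landmass = land.dissolve()
points = landmass.geometry.sample_points(size=1000).explode(index_parts=False)
points = gpd.GeoDataFrame(geometry=points, crs=land.crs)

sample = gpd.sjoin_nearest(points, defences, how="inner")
sample = sample.drop_duplicates(subset="index_right")  # one point per defence
```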

2.3. Calculation of Input Feature Dataset Statistics for Random Forest Classifier

The next steps concern calculating the Input Feature Dataset summary statistics for the Classified Defences (where the statistics are used to train the RFC) and the All Defence Features (where the statistics are used to classify the dataset). The process for calculating the Input Feature Dataset summary statistics is essentially the same for each defence feature dataset (Classified Defences and All Defence Features). In both cases, summary statistics are calculated for the Input Feature Datasets introduced in Table 1. The justification for choosing these datasets is as follows: MERIT DEM (Yamazaki et al., 2017) is chosen as elevation clearly correlates with flood risk and therefore with the likelihood of tagged defence features being correct (Kulp & Strauss, 2019); MERIT Hydro HAND (Yamazaki et al., 2019) is chosen as HAND shows height relative to the nearest drainage channel, which is also relevant; GSW (Pekel et al., 2016) is chosen as proximity to water has a bearing on the need for flood defences; and GFP (Nardi et al., 2019) is chosen as it indicates distance to floodplains, which also influences the likelihood of a tagged defence feature being correct.

These statistics are calculated within a 1km buffer of the randomly generated point on each feature in the Classified Defences and All Defence Features datasets. The maximum distance of 1km is chosen as Tobler's law suggests that things which are geographically closer together are more closely correlated (Waters, 2016). A smaller distance is not chosen because flood defence features often stretch for hundreds of metres and it is deemed important to capture the geomorphology that surrounds most or all of the feature. Statistics for the GFP and GSW Input Feature Datasets include the Euclidean distance from floodplains and surface water respectively, which is generated using an additional pre-processing step. The key parts of the process are as follows:

• Euclidean distance from surface water is calculated for the GSW dataset and Euclidean distance from floodplains for the GFP dataset. Two new datasets, one for each set of Euclidean distances calculated, are created. This means there are now five global datasets: GSW, GSW Euclidean Distance, DEM, HAND and GFP Euclidean Distance (the GFP Euclidean Distance dataset replaces the original GFP dataset and is henceforth referred to as 'GFP'). Note the Input Feature Datasets now contain the extra dataset GSW Euclidean Distance. This is not explicitly shown in Figure 1 as an input, as it was generated for this research and is considered part of the summary statistics calculation process.

• Summary statistics are calculated for each of these datasets within a 1km buffer of a randomly generated point on each feature.

• The results are combined into data frames which encompass the summary statistics for each feature (one data frame for the Classified Defences, one for the All Defence Features).

• Each feature of the Classified Defences now includes a label indicating whether the defence is correct or incorrect, plus its Input Feature Dataset summary statistics. Each feature of the All Defence Features includes its Input Feature Dataset summary statistics but no correct/incorrect label; these labels are one of the outputs of the RFC (see Section 2.4.).

The actual summary statistics generated vary per Input Feature Dataset. Note that as some Input Feature Datasets have more than one statistic (or variable) calculated, the individual statistics calculated are henceforth referred to as 'Input Feature Dataset Variables'. For HAND, the mean, standard deviation and range are calculated: the mean gives the average height above the nearest drainage channel, while the standard deviation and range are intended to account for the variability of elevation (above drainage channels) in the landscape. For the DEM, the mean and standard deviation are calculated to get the average elevation and a proxy for elevation variation. Note the standard deviations of the DEM and HAND datasets are closely correlated, but it is proposed that the DEM standard deviation will capture terrain the HAND standard deviation will miss (for example, a large drop which is separated from the feature being assessed by a small hill). For the GSW and GSW Euclidean Distance, the means are calculated, indicating how much surface water is in the neighbourhood of each feature and the mean distance to this water. For the GFP Euclidean Distance, the mean is calculated to indicate how far the feature is from a floodplain.
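A minimal sketch of this pre-processing and buffer-statistics step is given below, assuming numpy, rasterio, scipy and rasterstats; the file names are hypothetical, and real rasters of this size would need to be processed in tiles rather than in one array as shown here.

```python
# A minimal sketch of (1) deriving a Euclidean distance raster from the GSW
# water mask and (2) computing summary statistics within 1km buffers.
import rasterio
from rasterstats import zonal_stats
from scipy import ndimage

# 1. Euclidean distance from surface water for the GSW layer (the same
#    approach applies to the GFP floodplain layer).
with rasterio.open("gsw_max_extent.tif") as src:  # hypothetical file
    water = src.read(1) > 0       # True where water is present
    pixel_size = src.res[0]       # assumes square pixels in metres
    profile = src.profile

# distance_transform_edt measures distance to the nearest zero cell, so
# invert the mask to obtain distance *to* water at every land cell.
distance = ndimage.distance_transform_edt(~water, sampling=pixel_size)

profile.update(dtype="float32", count=1)
with rasterio.open("gsw_distance.tif", "w", **profile) as dst:
    dst.write(distance.astype("float32"), 1)

# 2. Summary statistics within a 1km buffer of each feature's sample point
#    ('sample' as in the sampling sketch above; buffers need a metric CRS).
buffers = sample.geometry.buffer(1000)
dem_stats = zonal_stats(buffers, "merit_dem.tif",
                        stats=["mean", "std", "range"])
```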

2.4. Classification Using Random Forest Classifier

The final steps involve training the RFC and using it to classify the All Defence Features dataset (see Figure 1). This allows an assessment of how many tagged features in the dataset are correctly labelled as defences and how many are incorrect, thus providing flood modellers with a means to assess how much of the dataset they can use and in what capacity. The process for training, evaluating and using the model is described below:

• Initially, the RFC is trained using the Input Feature Dataset Variables and flood defence labels (correct or incorrect) of the Classified Defences.

• Next, an evaluation is made of the accuracy of this classification. 5-fold cross-validation is used to give an accuracy score for prediction on unseen data.

• Following this, the importance of individual Input Feature Dataset Variables is assessed. Partial dependence plots for each variable are generated, allowing the relationship between each variable and the RFC's prediction to be further interpreted.

• Finally, the trained model is used to classify the All Defence Features dataset and to give an indication of the performance of specific flood defence tags (e.g. 'man_made=dyke').

• Thus, the final outputs from using the RFC are a classified All Defence Features dataset, a breakdown of the performance of specific tags (which represent subsets of the All Defence Features dataset), an accuracy score indicating the accuracy of the RFC and partial dependence plots which can be used to assess the marginal importance of Input Feature Dataset Variables.

An RFC is an estimator which aggregates the results of a defined number of decision tree classifiers. This use of averaging both controls over-fitting and improves predictive accuracy (scikit-learn, n.d.). RFC is used as the classification machine learning model as it offers good predictive capability with unbalanced data (the Classified Defences dataset is biased towards incorrect defences) and requires minimal pre-processing (scikit-learn, n.d.). It is available as part of the scikit-learn module.

The default hyperparameter (a parameter external to the model) is used for the number of estimators (100). The maximum depth of trees is set at 5. Greater tree depth allows a tree to have more splits, thus capturing more information, but can lead to model overfitting; a depth of 5 is considered here a balanced choice. 5-fold cross-validation is used to assess the performance of the model on unseen testing data. The relative importance of each of the Input Feature Dataset Variables in predicting whether a defence is correct or incorrect is determined using built-in scikit-learn methods, which are further elaborated upon in Results Section 3.3.
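The following sketch illustrates this setup with scikit-learn; here X and y are assumed to hold the Input Feature Dataset Variable summary statistics and the manual correct/incorrect labels of the Classified Defences, and X_all the statistics of the All Defence Features.

```python
# A minimal sketch of the RFC configuration described above.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

model = RandomForestClassifier(
    n_estimators=100,  # scikit-learn default number of trees
    max_depth=5,       # tree depth used in this study
    random_state=42,   # for reproducibility (assumed, not stated in text)
)

# 5-fold cross-validation accuracy on the Classified Defences.
scores = cross_val_score(model, X, y, cv=5)
print(f"accuracy: {scores.mean():.2f} (std {scores.std():.2f})")

# Fit on all Classified Defences, then classify the full dataset and read
# off the impurity-based importance of each input variable (sums to 1).
model.fit(X, y)
labels = model.predict(X_all)
importances = model.feature_importances_
```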

Partial dependence plots are generated to allow for interpretation of the marginal effects of each Input Feature Dataset Variable on the prediction of the model. Partial dependence plots are designed to make machine learning easier to interpret. They show the effect of changing the value of one predicting variable (in this case an Input Feature Dataset Variable) whilst keeping all other predicting variables at fixed values (scikit-learn, n.d.).
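Such plots can be generated with scikit-learn's inspection module, sketched below for the fitted model above (the variable names are hypothetical column names of X).

```python
# A minimal sketch of generating partial dependence plots for selected
# Input Feature Dataset Variables.
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay

PartialDependenceDisplay.from_estimator(
    model, X, features=["dem_mean", "dem_std", "hand_mean"]
)
plt.show()
```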

Once the RFC has been trained using the Classified Defences, it is used to classify the All Defence Features dataset. As indicated above, it makes this prediction using the Input Feature Dataset Variable summary statistics learned from the correct and incorrect features of the Classified Defences. The output classified All Defence Features dataset is further subdivided by tag to assess the performance (i.e. the percentage of correct and incorrect defence features) of specific tags.
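The per-tag breakdown amounts to a simple grouped count, sketched below assuming a pandas DataFrame 'results' with hypothetical columns 'tag' and 'predicted'.

```python
# A minimal sketch of the per-tag performance breakdown.
per_tag = (results.groupby("tag")["predicted"]
           .value_counts(normalize=True)
           .rename("fraction")
           .reset_index())
print(per_tag)  # fraction correct/incorrect per tag, e.g. man_made=dyke
```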


Chapter 3: Results

This chapter outlines the key results of this study. First it discusses the number of flood defence features found in the OSM database (the All Defence Features) and how these were distributed both spatially and in terms of OSM tag. It then goes on to describe the results of feature classification dealing with the Classified Defences and the All Defence Features datasets. Next it examines the cross-validation accuracy for the RFC used to classify features. Finally, it looks at the relative importance of the different Input Feature Dataset Variables used with the aid of partial dependence plots.

3.1. Flood Defence Features

This section outlines the number of flood defence features in OSM and how they were distributed. The All Defence Features dataset contained a total of 128,859 features, which were globally distributed (see Figure 2). There were at least 2,000 features in every continent except Oceania (with 899) and Antarctica (with 40). However, the distribution was not even; certain continents contained a much higher concentration of features. Europe had by far the most features with 96,599; in contrast, Africa, with a greater landmass and population, had 2,570. Such discrepancies could be due to OSM having different levels of popularity in different regions. Table 3 outlines the number of tagged features per continent.

Continent | Feature Count
Europe | 96,599
Asia | 15,427
North & Central America | 8,221
Africa | 2,570
South America | 2,025
Oceania | 899
Antarctica | 40

Table 3: Number of tagged flood defence features in OSM per continent.

3.2. Feature Classification

This section deals with the results of the manual and automatic classification of OSM flood defence features. Following the removal of features that contained no-data values in their Input Feature Dataset Variable summary statistics, the total number of Classified Defences was 364. Of these, 91 were classified as correct and 273 as incorrect (meaning 25% of features were classified as correct). Figure 3 shows how these features were distributed across the globe. There was a relatively even spatial distribution of correctly and incorrectly tagged features.


The All Defence Features dataset classified by the RFC contained 2,063 flood defence features classified as correct and 126,273 classified as incorrect. This means 1.61% of features were classified as correct. The percentage of features classified as correct varied per tag. For example, for the most common tag, 'man_made=embankment', only 1.11% of features were classified as correct (see Figure 4), translating to 1,263 correct features and 111,520 incorrect features. For the second most common tag, 'man_made=dyke', 4.51% of features were classified as correct, translating to 690 correct features and 15,287 incorrect features (see Figure 4).

Figure 3: Figure showing the global distribution of the manually classified flood defence features. No clear patterns emerged in terms of differences in the spatial distribution of the correct and incorrect features.

Figure 4: Bar charts showing the number of correct and incorrect flood defence features for the two most common tags (y-axis: number of features). On the left is 'man_made=dyke' with 690 correct features; on the right is 'man_made=embankment' with 1,263 correct features. Given the dataset sizes, 'man_made=dyke' is around four times more likely to be correct than 'man_made=embankment'.


3.3. Accuracy Metrics and Feature Importance

This section deals with the accuracy metrics generated for the RFC classification and discusses the relative importance of the different Input Feature Dataset Variables. For the accuracy assessment, cross-validation was used to assess the performance of the RFC on unseen data (scikit-learn, n.d.). 5-fold cross-validation on the training data gave a score of 0.75 with a standard deviation of 0.02. This score represents the proportion (from 0 to 1) of the model's predictions that were correct. 5-fold cross-validation on the testing data gave an accuracy score of 0.70 with a standard deviation of 0.05.

The feature importance of each Input Feature Dataset Variable used in the RFC was ranked using built-in scikit-learn functionality. Feature importance represents the relative importance of a given variable in a model's prediction compared to the other variables used (scikit-learn, n.d.). It is weighted by the number of samples that reach the nodes splitting on that variable, divided by the total number of samples (Ronaghan, 2019). The combined importance of all Input Feature Dataset Variables always sums to 1, so each variable is given a proportion of this total in line with its relative importance. Table 4 shows the relative importance of the different Input Feature Dataset Variables used for prediction in the RFC.

It is clear from the feature importance rankings that Input Feature Dataset Variables which directly relate to elevation were the most important for prediction. The top five variables in terms of importance were DEM derivatives, with variables coming from the DEM dataset marginally more important than those coming from the HAND dataset. GSW mean Euclidean distance also had a significant impact on prediction but GSW mean and GFP mean Euclidean distance had a low impact.

Input Feature Dataset Variable | Feature Ranking | Relative Importance Score
DEM Mean | 1 | 0.17
DEM Standard Deviation | 2 | 0.16
HAND Standard Deviation | 3 | 0.16
HAND Mean | 4 | 0.14
HAND Range | 5 | 0.13
GSW Mean Euclidean Distance | 6 | 0.13
GSW Mean | 7 | 0.08
GFP Mean Euclidean Distance | 8 | 0.04

Table 4: Table showing the relative importance of the different Input Feature Dataset Variables in the classification made by the RFC. The relative importance scores always total 1.

Partial dependence plots were generated to give extra insight into the relationship between each specific Input Feature Dataset Variable and the RFC classification. Figure 5 shows all partial dependence plots generated. The DEM partial dependence plots show a clear trend where lower elevations and low standard deviations correlate positively with a prediction of a correctly tagged flood defence. The HAND mean and HAND standard deviation plots show a similar pattern. It should be noted that these Input Feature Dataset Variables are directly correlated with each other (and thus the importance of any individual partial dependence plot should not be overstated). The GSW mean Euclidean distance plot shows a pattern where a small average distance (under 10m) indicates a defence is less likely to be correct than an average distance of 10-30m. The GSW mean plot does not show a distinct relationship. The GFP mean Euclidean distance plot shows that a lower distance from floodplains correlates positively with a greater chance of a defence being correct. It also shows that a very large distance from a floodplain correlates strongly with a correct classification, but this should be discarded as an outlier, as very little data shows this relationship. It should also be acknowledged that this Input Feature Dataset Variable had the lowest feature importance, so it should not be treated as important in prediction.

Figure 5: Figure showing the partial dependence plots for every Input Feature Dataset Variable used. The black vertical marks at the bottom of each plot show how the data is distributed (each mark represents 10% of the total data). Each Input Feature Dataset Variable is visualised at every value in its range on the x-axis. The blue line indicates the variable's contribution to a prediction of a correct or incorrect defence at any given value. The y-axis indicates the partial dependence, or marginal importance, of the variable at that value: where the value is higher, the variable is more likely to contribute to a prediction that the classified defence is correct.


Chapter 4: Discussion

This chapter deals with the implications, validity and usefulness of this study. It begins with an assessment of the key results, examining the classification of the All Defence Features dataset and how this relates to OSM data accuracy. It goes on to discuss the importance of the different Input Feature Dataset Variables used. Next, it assesses the validity of the methodology used in this study and addresses known limitations. Finally, it examines the wider implications of this research and outlines how it could be built upon in the future.

4.1. Analysis of Results

4.1.a. Extracted Defence Features

With a total of 128,859 tagged flood defence features, OSM offers a significant and potentially very useful dataset of flood defences. As outlined in Results Section 3.1, whilst the distribution was uneven, there was still a significant number of features on every continent (except Antarctica). The potential utility of the OSM dataset should not be underestimated, particularly given that it is ever expanding. This study therefore shows that an accurate examination of these features, so that they can be used with confidence for hydrological applications, would be greatly beneficial.

Results Section 3.2 describes that only 2,063 flood defences in the All Defence Features dataset were labelled correctly (1.61% of features). This finding does not align with the make-up of the Classified Defences, which had 91 correctly labelled and 273 incorrectly labelled defences (25% labelled correctly). The key takeaways from these findings are that, firstly, most defences in OSM are incorrectly tagged; secondly, despite this, there are still many correctly labelled defences; and thirdly, the automatic classification using the RFC significantly underestimated the number of defences which were correctly tagged. Note the decision to label dams as incorrect (see Methodology Section 2.2.) may have limited the number of correct features, especially as some dams protect against flash flooding.

This study did not stratify areas by land-use type, nor did it examine the percentage of defences that were correct in particular regions; in this respect it proved inconclusive (any attempt to determine the number of correct defences per region would be more achievable with a better-functioning classification model that did not underestimate the number of correct defences). However, interesting patterns in the performance of specific tags were revealed. Most pertinently, it was found that the most popular tag, 'man_made=embankment', was significantly less likely to be correct than the second most popular tag, 'man_made=dyke'. Only 1.11% of the defence features tagged 'man_made=embankment' were correct, compared to 4.51% of features tagged 'man_made=dyke' (see Figure 4).

It is proposed here that this difference is due to 'man_made=embankment' being more of a catch-all term. Manual observation showed this tag was often used for features such as small compound walls, parts of a road rising to join a bridge or elevated highway, and natural rocky embankments. The tag 'man_made=dyke' was used more specifically to tag actual flood defences.

4.1.b. OpenStreetMap as a Data Source

The study confirms what has previously been found in other research (e.g. Schellekens, 2014; Senaratne et al., 2017), namely that crowdsourced data can be unreliable. This problem manifested in different ways: it was common for different tags to be used to describe the same type of feature; moreover, there were occasional erroneously tagged features (this could be as simple as someone creating an isolated feature in the middle of nowhere, perhaps to learn how the tagging system works). Thus, it is demonstrated that the OSM data of tagged flood defences is not reliable enough to be used without additional filtering or processing.

The study therefore re-confirms that data quality checks are a necessity when using crowdsourced data (in line with Goodchild & Li, 2012). Given the vast and ever-growing number of features available in the OSM dataset, it also confirms the finding of Donchyts et al. (2016) that classification of OSM features should be achieved with automatic methods. It is argued here that the low percentage of flood defence features identified as correct should not be used to dismiss OSM as unhelpful as a flood defence dataset. Rather, it is potentially very helpful, but only with the correct data quality checks (which can be implemented by building on the methods described in this study).

4.1.c. Input Feature Dataset Variables

There was considerable variation in the usefulness of the different Input Feature Dataset Variables used (see Results Section 3.3; Ronaghan, 2019). DEM mean was the most important variable in prediction, with a relative importance score of 0.17 (see Table 4), closely followed by DEM standard deviation with a score of 0.16. It should be acknowledged that both variables appeared strongly correlated. This makes sense from a geomorphological perspective; for example, in a floodplain both a low mean elevation and a low standard deviation in elevation would be expected. The finding that DEM derivatives show a clear relationship with flood defences (and therefore flooding) relates to the findings of Van de Sande, Lansen and Hoyng (2012), who note the importance of elevation when assessing coastal flood risk.

Both DEM mean and standard deviation showed a similar pattern in their partial dependence plots, with a low mean elevation and a low standard deviation of elevation, respectively, indicating a greater likelihood of a defence being correct (see Figure 5). This suggests that an area that is both low-elevation and relatively flat is the most likely to contain a correctly labelled defence, again pointing to floodplains. It should be noted that this outcome was expected, given that the manual classification criteria deemed defences at high elevation less likely to be correct.

HAND standard deviation also had a relative importance score of 0.16. The importance of HAND values in relation to mapping flooding is highlighted by Garousi-Nejad et al. (2019). The partial dependence plot for this Input Feature Dataset Variable again indicated that flatter areas with a lower standard deviation were more likely to contain a correct defence. Interestingly, this relationship only holds for standard deviations under 7m; at 7m and above, classification as correct became more likely again. Additionally, at values of 10m or above, HAND standard deviation correlated more positively with defences being correct than DEM standard deviation did. This could be because the DEM standard deviation was capturing large elevation dips which water does not drain into, and which are therefore less hydraulically relevant. Following this, HAND mean, with a score of 0.14, showed a very similar partial dependence plot to the DEM variables, echoing them in indicating a greater likelihood of a correct classification at low values. It should be noted that the HAND data was generated in a way that counts small streams as drainage basins (where HAND values are 0) (Yamazaki et al., 2017). These streams may not have much relevance in terms of flooding; a different dataset taking HAND values only for large streams might be a more useful variable.

The GSW mean Input Feature Dataset Variable did not show a clear relationship in its partial dependence plot. However, GSW mean Euclidean distance showed an interesting relationship where very low values (under 10m mean) indicated a defence was less likely to be correct than low values (10-30m mean). This seemed to be due to the criteria stating that features beside permanent water, and dams, both of which would be expected to show very low Euclidean distances from water, should not be considered flood defences. Note this variable had a low relative importance score of 0.08, so it was relatively unimportant in prediction.

The GFP dataset showed a very low relative importance score of 0.04. This is deemed to be related to the fact that the unit of scale in the Euclidean distance data used was far too large (one degree of latitude), leading to most tagged defence features having a mean distance value of 0. If this error is corrected in future studies, this could prove an important Input Feature Dataset Variable.

4.2. Effectiveness and Limitations of the Study

Overall, the method used was useful and brought interesting insights, and it serves as a good example that should be built upon. However, some clear limitations in the methodology emerged, in three distinct areas. Firstly, limitations emerged relating to the OSM features in the Classified Defences dataset: the method for randomly generating these incorporated some trade-offs, the method for classifying them worked better in certain regions than others, and the total number of features in this dataset appeared inadequate. Secondly, limitations emerged relating to the Input Feature Dataset Variable summary statistics calculated for both the Classified Defences and the All Defence Features datasets: the chosen 1km buffer size for calculating summary statistics appeared too large, the GFP dataset was incorrectly utilised, and additional datasets which may have been useful were not included. Thirdly, limitations emerged relating to the RFC: it was not fully optimised, and it may not have been the most effective machine learning model for this type of classification.

Firstly, consider the limitations related to the OSM features used for the Classified Defences. The random points representing the Classified Defences (these served as the centres of the buffers used to calculate the Input Feature Dataset Variable summary statistics) were generated to be equally spatially distributed across the earth's landmass (see Methodology Section 2.2.). This was valid and helped ensure that the method was applicable for assessing the entire globe. However, it did lead to significant overrepresentation of defence features in remote areas relative to areas with a high concentration of tagged defences, such as Europe and Eastern Asia. The actual manual classification of data was effective (see Methodology Section 2.2.), and was enhanced by Google Earth's (Google Earth, n.d.) ability to visualise both current and historic satellite imagery. Taylor and Lovell (2012) offer a notable example of successful classification using Google Earth. However, this thesis looks at a much larger and more diverse landmass than Taylor and Lovell; a significant limitation that emerged was the high variability in the spatial and temporal resolution of the available imagery, making classification difficult in certain regions.

The modest number of defences manually classified for the Classified Defences dataset represents another limitation; manual classification volume was limited by time constraints. 384 defences were classified, which was most likely inadequate given that the dataset being assessed consisted of over 120,000 features. The optimal size of training data varies on a case-by-case basis; Han and Kim (2019) outline an algorithm that can help determine this. Manually classifying more defences, for example 1,000, may have meant that defences classified as correct showed a greater difference in Input Feature Dataset Variable summary statistics compared to defences classified as incorrect.

It should be noted that the Classified Defences dataset used to train the RFC was unbalanced, in that it contained a greater number of incorrect than correct defences, which may also have limited accuracy (even though RFC is well suited to unbalanced data (scikit-learn, 2021)). More (2016) discusses issues that can emerge when using unbalanced data for machine learning. Measures to ensure a more balanced dataset could have been taken when increasing the size of the Classified Defences dataset. For example, this dataset could have been supplemented by adding correctly tagged defences verified against other reliable sources.

Secondly, limitations emerged in the process of calculating summary statistics for the Input Feature Dataset Variables (see Methodology Section 2.3.). Whilst the summary statistics generated from the Input Feature Datasets were for the most part useful, the choice of a 1km buffer for calculating statistics appears to have been sub-optimal. It could have led to non-hydrologically relevant information, at a large distance from the feature being assessed, negatively influencing automatic classification; again referring to Tobler's law, things which are geographically closer together are more closely correlated (Waters, 2016). Improvements could potentially have been achieved by iteratively experimenting with different buffer sizes for calculating summary statistics and selecting the best size. It should be noted that such experimentation was limited by the computational intensity of calculating summary statistics (solutions for this issue are highlighted in Section 4.3. of this chapter). Another clear limitation was the lack of accuracy of the GFP mean Euclidean distance (as mentioned in Section 4.1.c. of this chapter). A further limitation came from the limited choice of datasets used for calculating Input Feature Dataset Variable summary statistics. The datasets used appeared closely correlated; this was particularly true of those that were direct derivatives of elevation. Such correlation made interpretation of the effect of an individual Input Feature Dataset Variable difficult. Moreover, no dataset was incorporated that included geospatial information about built-up areas. It is logical to expect that flood defences are used to protect built-up areas (e.g. Kaźmierczak & Cavan, 2011), so such a dataset could have been useful. Additionally, a dataset of railroads, which are often built on embankments that do not serve as flood defences, would have been useful for detecting misclassified features (e.g. Global Railways, WFP Logistics Cluster, 2017).

Thirdly, limitations arose relating to the RFC chosen, both in terms of its optimisation and its usefulness compared to alternative machine learning models (see Methodology Section 2.4.). Before these limitations are addressed, it should be noted that the RFC analysis showed promise; such analysis represents an appropriate way to assess and weigh different input datasets for their importance in prediction (scikit-learn, 2021). Some effort was made to optimise the model, including a sequential process of omitting and including different combinations of Input Feature Dataset Variables. Whilst some combinations showed higher cross-validation accuracy scores than using all variables, they predicted an even lower number of correct defences. Given the apparent severe underprediction of the RFC, a subjective decision was taken to prioritise the combination which predicted more defences as correct.

However, this component of the analysis could have been further enhanced by greater hyperparameter optimisation. Probst, Wright and Boulesteix (2019) discuss how to optimise random forest models. All RFC hyperparameters were left at their defaults except tree depth, which was set to 5; a methodical, iterative approach to tuning them could have achieved better results. Using different hyperparameters may have both increased cross-validation accuracy scores and lessened the underprediction of correctly tagged flood defence features.
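A hedged sketch of such a methodical search, using scikit-learn's GridSearchCV with purely illustrative grid values, follows; F1 is used as the scoring metric because plain accuracy is misleading on unbalanced classes.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Illustrative grid; values would need tuning to the real dataset.
param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [5, 10, 20, None],
    "min_samples_leaf": [1, 3, 5],
    "class_weight": [None, "balanced"],
}
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=5, scoring="f1")
search.fit(X, y)  # X, y as in the earlier sketches
print(search.best_params_, search.best_score_)
```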

Moreover, RFC is not necessarily the most suitable machine learning model for the classification of flood defence features outlined in this thesis. RFC was chosen for the reasons outlined in the methodology (see Methodology Section 2.4.); however, iteratively assessing results with other machine learning models may have enhanced them. For example, Convolutional Neural Networks perform well on classification problems (Keras, n.d.) and have previously been applied to flooding-related problems (e.g. Gebrehiwot et al., 2019).
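Before moving to imagery-based models such as CNNs, which would require training on satellite image patches rather than the tabular summary statistics used here, simpler alternatives could be benchmarked under identical cross-validation. A minimal sketch, again assuming X and y as above:

```python
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Candidate models compared on the same folds and features.
models = {
    "random_forest": RandomForestClassifier(max_depth=5, random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Feature scaling (e.g. a StandardScaler inside a Pipeline) would be advisable for the non-tree models.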

4.3. Future Recommendations

In considering the implications of this research, it is clear that it takes a novel approach to improving global coverage of flood defences. As this chapter has demonstrated, whilst OSM represents a potentially excellent global data source, a high percentage of its flood defence data is unreliable. This study has sought to remedy this by automatically classifying flood defence features, echoing Donchyts et al. (2016), who used similar methods to produce a surface water mask. It has taken promising first steps towards this classification and offers a methodology which can be built upon and improved; to this end, all code used in this research is freely available on GitHub. Because this study has outlined a framework for addressing such an important issue, further research in this field is highly recommended.

Such research should build directly upon the work already done rather than 're-inventing the wheel'. There are easy-to-implement, common-sense steps to remedy the limitations of this study and further determine the accuracy of OSM flood defences. The first, as highlighted in the previous section, is increasing the number of manually classified points used for training the RFC; the current classification methodology is easily repeatable using the freely available software Google Earth and QGIS. The next recommended step is to set a smaller buffer size for calculating summary statistics. The study should be run multiple times with different buffers (e.g. 100 m, 200 m, 500 m) to determine the optimal buffer size.

In terms of the Input Feature Dataset Variables, the GFP mean Euclidean distance should be recalculated to give Euclidean distance at a more precise scale, allowing the dataset's potential to be properly utilised by the RFC. Moreover, additional datasets should be added to the study. These should include a dataset showing the degree of urban development, such as the global-scale dataset of Schneider, Friedl and Potere (2009).
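One way to recalculate the distance at the native raster resolution is a Euclidean distance transform, as in the hedged sketch below; "gfp.tif" is a hypothetical floodplain mask (1 = floodplain, 0 = elsewhere), square cells are assumed, and distances come out in map units.

```python
import rasterio
from scipy.ndimage import distance_transform_edt

with rasterio.open("gfp.tif") as src:  # hypothetical file name
    floodplain = src.read(1)
    cell_size = src.res[0]  # assumes square cells

# distance_transform_edt measures the distance from each non-zero cell to
# the nearest zero cell, so invert the mask to obtain, for every cell
# outside the floodplain, the distance *to* the floodplain.
distance_to_floodplain = distance_transform_edt(floodplain == 0) * cell_size
```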

Another clear area for improvement would be better optimisation of the hyperparameters used when classifying the entire flood defence dataset with machine learning. An iterative approach should be adopted to determine (a) the best machine learning model to use and (b) which input datasets to use. Moreover, increased optimisation could determine which Input Feature Dataset Variables (e.g. HAND mean, GSW mean Euclidean distance) should be included and which omitted. In general, using fewer input variables allows results to be interpreted more easily and can lead to more accurate classification (Chen et al., 2020).
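One hedged way to operationalise this is to rank variables by the fitted RFC's impurity-based importances and keep only the stronger half, for example with SelectFromModel (X and y as assumed in the earlier sketches):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

rfc = RandomForestClassifier(max_depth=5, random_state=42).fit(X, y)

# Keep variables whose importance exceeds the median importance.
selector = SelectFromModel(rfc, threshold="median", prefit=True)
X_reduced = selector.transform(X)
print("kept:", list(X.columns[selector.get_support()]))
```

Impurity-based importances can be biased towards correlated variables, which is worth bearing in mind given the correlation between the elevation-derived datasets noted above.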

One potential avenue for development, which would also help solve issues relating to limited computational capacity, would be to carry out some or all of the analysis on Google Earth Engine (GEE) (Gorelick et al., 2017). GEE is freely available, allows analysis of large quantities of satellite imagery and geospatial data, and uses Google Cloud computing to process data quickly. It could be integrated into the workflow to aid the initial exploration and processing of data. Another common-sense step would be to experiment with different ways of selecting flood defences for manual classification. For example, selection could be done at random across all defences (as opposed to an evenly spatially distributed selection). The study could even be carried out on a subset of the data (for example, only the data in Asia) to see if better results can be achieved by looking at one continent at a time.
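As a flavour of what moving the summary-statistic step to GEE could look like, the following is a hedged sketch using the GEE Python API; the surface water asset is a real public dataset (Pekel et al., 2016), but the point location, buffer and reducer choices are purely illustrative.

```python
import ee

ee.Initialize()

# Global Surface Water occurrence hosted on GEE.
gsw = ee.Image("JRC/GSW1_3/GlobalSurfaceWater").select("occurrence")

# A hypothetical defence location, buffered by 500 m.
zone = ee.Geometry.Point([4.90, 52.37]).buffer(500)

# Mean occurrence within the buffer, computed server-side at 30 m scale.
stats = gsw.reduceRegion(reducer=ee.Reducer.mean(), geometry=zone, scale=30)
print(stats.getInfo())
```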

Furthermore, it should be noted that the methods used in this study need not apply only to assessing flood defences, or even to hydrological applications. Rather, this study has offered a framework for assessing OSM data quality that can be adapted and utilised for any number of applications that could benefit from OSM's large and ever-growing geospatial database. Thus, this study is not an end point but rather a first step in providing a robust and repeatable way to further open the vast OSM database to hydrological and environmental science applications.


Conclusion

This thesis has directly responded to the lack of global data on flood defences by adopting a novel approach: using machine learning to automatically assess the accuracy of tagged flood defences in OSM. Through this it has sought to address concerns about the quality of crowdsourced data and to open this global (and ever-increasing) dataset of over 120,000 flood defences for use in global flood models.

The thesis began by extracting relevant tagged flood defences from OSM. It then manually classified a subset of these defences through visual interpretation of current and historical satellite imagery. Following this, it calculated summary statistics from four principal Input Feature Datasets (see Table 1). Finally, it used the summary statistics of the manually classified flood defences to train the RFC and classify the entire OSM flood defence dataset.

In this it has had mixed success. Whilst conceptually the research is strong, the RFC used underestimated the number of OSM flood defences which are correctly tagged. Despite this setback, the thesis has laid excellent groundwork to be built upon, and key recommendations to improve its methodology have been outlined. These include manually classifying a greater number of flood defence features, calculating summary statistics for defences using smaller buffers, and adding additional Input Feature Datasets.

Using these recommendations and the code for this research, which is freely available on GitHub, future researchers can take a range of steps to optimise and further develop this study. As such, the thesis has been highly valuable in beginning the process of opening up what is potentially the largest global database of flood defence features to hydrological modellers.

Flooding is a massive global problem; such is its scale that it can only be effectively combatted with the right information. Large-scale flood models are impeded by a lack of data on flood defences, and this thesis makes key progress towards delivering that data. As such, it represents a key step in improving flood models and, by extension, empowering decision makers to make better choices, thus limiting the risks of economic damage and loss of life posed by global flooding.


References

Bozza, A., Durand, A., Confortola, G., Soncini, A., Allenbach, B., & Bocchiola, D. (2016). Potential of remote sensing and open street data for flood mapping in poorly gauged areas: a case study in Gonaives, Haiti. Applied Geomatics, 8(2), 117-131.

Casas, A., Riaño, D., Greenberg, J., & Ustin, S. (2012). Assessing levee stability with geometric parameters derived from airborne LiDAR. Remote Sensing of Environment, 117, 281-288.

Chen, R. C., Dewi, C., Huang, S. W., & Caraka, R. E. (2020). Selecting critical features for data classification based on machine learning methods. Journal of Big Data, 7(1), 1-26.

Choung, Y. (2014). Mapping levees using LiDAR data and multispectral orthoimages in the Nakdong river basins, South Korea. Remote Sensing, 6(9), 8696-8717.

Donchyts, G., Schellekens, J., Winsemius, H., Eisemann, E., & Van de Giesen, N. (2016). A 30 m resolution surface water mask including estimation of positional and thematic differences using Landsat 8, SRTM and OpenStreetMap: a case study in the Murray-Darling Basin, Australia. Remote Sensing, 8(5), 386.

Du, S., Scussolini, P., Ward, P. J., Zhang, M., Wen, J., Wang, L., ... & Aerts, J. C. (2020). Hard or soft flood adaptation? Advantages of a hybrid strategy for Shanghai. Global Environmental Change, 61, 102037.

Garousi‐Nejad, I., Tarboton, D. G., Aboutalebi, M., & Torres‐Rua, A. F. (2019). Terrain analysis enhancements to the height above nearest drainage flood inundation mapping method. Water Resources Research, 55(10), 7983-8009.

Gebrehiwot, A., Hashemi-Beni, L., Thompson, G., Kordjamshidi, P., & Langan, T. E. (2019). Deep convolutional neural network for flood extent mapping using unmanned aerial vehicles data. Sensors, 19(7), 1486.

Geofabrik. (n.d.). Retrieved September 12, 2020, from https://www.geofabrik.de/

Goodchild, M. F. (2007). Citizens as sensors: the world of volunteered geography. GeoJournal, 69(4), 211-221.

Goodchild, M. F., & Li, L. (2012). Assuring the quality of volunteered geographic information. Spatial Statistics, 1, 110-120.

Google Earth. (n.d.). Retrieved December 01, 2020, from https://www.google.com/earth/

Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment, 202, 18-27.

Hall, J. W., Brown, S., Nicholls, R. J., Pidgeon, N. F., & Watson, R. T. (2012). Proportionate adaptation. Nature Climate Change, 2(12), 833-834.

Han, S., & Kim, H. (2019). On the optimal size of candidate feature set in random forest. Applied Sciences, 9(5), 898.

Jongman, B., Ward, P. J., & Aerts, J. C. (2012). Global exposure to river and coastal flooding: Long term trends and changes. Global Environmental Change, 22(4), 823-835.


Jongman, B., Winsemius, H. C., Fraser, S. A., Muis, S., & Ward, P. J. (2018). Assessment and adaptation to climate change-related flood risks. In Oxford Research Encyclopedia of Natural Hazard Science.

Jongman, B., Wagemaker, J., Romero, B. R., & De Perez, E. C. (2015). Early flood detection for rapid humanitarian response: harnessing near real-time satellite and Twitter signals. ISPRS International Journal of Geo-Information, 4(4), 2246-2266.

Kaspersen, P. S., Ravn, N. H., Arnbjerg-Nielsen, K., Madsen, H., & Drews, M. (2017). Comparison of the impacts of urban development and climate change on exposing European cities to pluvial flooding. Hydrology and Earth System Sciences, 21(8), 4131-4147.

Kaźmierczak, A., & Cavan, G. (2011). Surface water flooding risk to urban communities: Analysis of vulnerability, hazard and exposure. Landscape and Urban Planning, 103(2), 185-197.

Keras. (n.d.). Keras documentation: Keras Applications. Retrieved March 23, 2021, from https://keras.io/api/applications/

Kulp, S. A., & Strauss, B. H. (2019). New elevation data triple estimates of global vulnerability to sea-level rise and coastal flooding. Nature Communications, 10(1), 1-12.

Ma, X. (2017). Spatial data. In L. Schintler & C. McNeely (Eds.), Encyclopedia of Big Data. Springer, Cham.

More, A. (2016). Survey of resampling techniques for improving classification performance in unbalanced datasets. arXiv preprint arXiv:1608.06048.

Narayan, S., Hanson, S., Nicholls, R. J., Clarke, D., Willems, P., Ntegeka, V., & Monbaliu, J. (2012). A holistic model for coastal flooding using system diagrams and the Source–Pathway–Receptor (SPR) concept. Natural Hazards and Earth System Sciences, 12(5), 1431-1439.

Nardi, F., Annis, A., Di Baldassarre, G., Vivoni, E. R., & Grimaldi, S. (2019). GFPLAIN250m, a global high-resolution dataset of Earth's floodplains. Scientific Data, 6, 180309.

OpenStreetMap. (n.d.). Retrieved September 01, 2020, from https://www.openstreetmap.org/#map=16/49.2303/28.4637

OpenStreetMap Taginfo. (n.d.). Retrieved July 28, 2020, from https://taginfo.openstreetmap.org/tags

OpenStreetMap Wiki. (n.d.). Retrieved September 05, 2020, from https://wiki.openstreetmap.org/wiki/Main_Page

Pekel, J. F., Cottam, A., Gorelick, N., & Belward, A. S. (2016). High-resolution mapping of global surface water and its long-term changes. Nature, 540(7633), 418-422

Probst, P., Wright, M. N., & Boulesteix, A. L. (2019). Hyperparameters and tuning strategies for random forest. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(3), e1301.

Ranger, N., Hallegatte, S., Bhattacharya, S., Bachu, M., Priya, S., Dhore, K., ... & Herweijer, C. (2011). An assessment of the potential impact of climate change on flood risk in Mumbai. Climatic Change, 104(1), 139-167.

Ronaghan, S. (2019, November 01). The Mathematics of Decision Trees, Random Forest and Feature Importance in Scikit-learn and Spark. Retrieved March 29, 2021, from
