Time series of maps of environmental variables
Space-Time Paths
Example: air pollution
id age sex …
1 12 m ..
2 11 m ..
3 45 f ..
4 4 f ..
id work class …
1 12 m ..
2 11 m ..
3 45 f ..
4 4 f ..
id age sex … e1 e2 ..
1 12 m .. 12.4 32.5
2 11 m .. 11.7 1.8
3 45 f .. 0.9 2.8
4 4 f .. 0.45 1.9
id e1 e2 …
1 12.4 32.5 ..
2 11.7 1.8 ..
3 0.9 2.8 ..
4 0.45 1.9 ..
Cohort enriched with personal Exposure
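The enrichment step shown in the example tables can be sketched as a join on the individual's id. This is a minimal illustration, not the GGHDC implementation: the function name `enrich` and the in-memory list-of-dicts layout are assumptions made for the example, and the values are copied from the example tables.

```python
# Hypothetical in-memory tables; values taken from the example tables.
cohort = [
    {"id": 1, "age": 12, "sex": "m"},
    {"id": 2, "age": 11, "sex": "m"},
    {"id": 3, "age": 45, "sex": "f"},
    {"id": 4, "age": 4, "sex": "f"},
]
exposure = [
    {"id": 1, "e1": 12.4, "e2": 32.5},
    {"id": 2, "e1": 11.7, "e2": 1.8},
    {"id": 3, "e1": 0.9, "e2": 2.8},
    {"id": 4, "e1": 0.45, "e2": 1.9},
]


def enrich(cohort_rows, exposure_rows):
    """Left-join the exposure columns onto the cohort records by id."""
    by_id = {row["id"]: row for row in exposure_rows}
    enriched_rows = []
    for person in cohort_rows:
        # Individuals without an exposure record keep their original columns.
        extra = by_id.get(person["id"], {})
        extra_columns = {k: v for k, v in extra.items() if k != "id"}
        enriched_rows.append({**person, **extra_columns})
    return enriched_rows


enriched = enrich(cohort, exposure)
for row in enriched:
    print(row)
```

Each enriched record carries the original cohort attributes plus one column per exposure variable (here `e1` and `e2`).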
Creating the PM10 map
Z_{PM10} = 23.7 + 2.2 Z_{traffic} + 6.7 Z_{people} + 0.02 Z_{road}
Z_{PM10}: PM10 (microgram/m3)
Z_{traffic}: number of vehicles within 500 m
Z_{people}: number of people living within 5 km
Z_{road}: total road length within 50 m (m)
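A minimal sketch of how such a regression could be applied cell by cell to predictor rasters to produce the PM10 map. The coefficients are those of the equation above. The function and argument names are hypothetical, the tiny 2 x 2 rasters are made-up illustration values, and the predictors are assumed to be already expressed in whatever units or scaling the regression was fitted with.

```python
import numpy as np

# Coefficients taken from the regression equation on the poster.
INTERCEPT = 23.7
B_TRAFFIC = 2.2   # vehicles within 500 m
B_PEOPLE = 6.7    # people living within 5 km
B_ROAD = 0.02     # total road length within 50 m (m)


def pm10_map(traffic, people, road_length):
    """Evaluate the linear model for every raster cell at once."""
    traffic = np.asarray(traffic, dtype=float)
    people = np.asarray(people, dtype=float)
    road_length = np.asarray(road_length, dtype=float)
    return (INTERCEPT
            + B_TRAFFIC * traffic
            + B_PEOPLE * people
            + B_ROAD * road_length)


# Made-up 2 x 2 predictor rasters, for illustration only.
traffic = [[0, 1], [2, 3]]
people = [[0, 0], [1, 1]]
road_length = [[0, 50], [100, 0]]

pm10 = pm10_map(traffic, people, road_length)
print(pm10)
```

Because the model is linear and local to each cell, the same expression works unchanged on rasters of any size, which is what makes it suitable for parallel evaluation.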
[Figure panels: North-South Transect (locations 1, 2, 3, 4); Time Series (1, 2, 3, 4); variables: NO2, PM10]
Other variables:
- noise
- radiation
- access to health care
- food exposure
- urban heat islands
- pollen
- access to parks, woods, etc
Time
Cohort
id age sex …
1 12 m ..
2 11 m ..
3 45 f ..
4 4 f ..
Example: modelled space-time path
Validation of exposure along modelled route
Regression: y = 0.491x + 4.6794, R² = 0.35, p-value = 0.00
Axes: Fastest Route Simulation NO2 Exposure (μg / m3 · h); Observed route NO2 Exposure (μg / m3 · h)
Left: space-time paths calculated with the Google routing API. Right: exposure along modelled route vs exposure along observed route. (Calculations: M. Geijer, data: D. Ettema).
Accident risk exposure along a walking trip
(brighter cells receive a lower weight). M. Helbich et al. 2016, Health & Place.
Assessment of linearity of the association
between air pollution and diabetes (simulated data). M. Strak et al. 2017 (unpublished).
Density of supermarkets within 1000 meters per cell, used to calculate accessibility to healthy food. M. Helbich et al. 2017, Applied Geography.
Identify health impacts
Personal exposure
Health
Examples health impact
Scientific Framework
Upload location information of individuals in cohort
Aggregate environmental variable along space-time path
X
Y
[Figure: OR (95% CI) for PM2.5, axis from 0.9 to 1.2, for the full population and by subgroup]
Age: 19−30 years, 31−45 years, 46−60 years, 61−75 years, > 75 years
Sex: Male, Female
Education: Primary or less, Lower−secondary, Higher−secondary, University
Household income: < 10,000 euro, 10,000−15,000 euro, 15,000−20,000 euro, 20,000−30,000 euro, > 30,000 euro
BMI: Overweight, Obese, Normal range
Smoking: Current, Former, Never
Alcohol use: Current, Former, Never
Introduction
Our environment has a considerable impact on health. For instance, air pollution increases the risk of cardiovascular disease, a warm and humid climate supports the spread of vectors that transmit malaria, and green space may improve mental health. Understanding these health impacts is a massive challenge, as it requires quantifying the exposure to these environmental variables for each individual in a population. The Global and Geo Health Data Centre (GGHDC) is taking on this challenge by providing a web service that enriches population data with information on personal exposures to the environment. We combine high-performance geocomputation and spatial data analysis to calculate personal exposures for individuals using their location data, either directly measured using mobile devices or generated by agent-based simulation modelling. Exposures are calculated from archived national and global environmental information (up to 5 m spatial and 1 h temporal resolution) or from data generated on the fly by environmental models running as microservices.
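To make the idea of a personal exposure concrete, the sketch below aggregates a gridded environmental variable along one individual's space-time path. It is an illustration under stated assumptions, not the GGHDC algorithm: the function names, the nested-list raster layout (`rasters[time step][row][column]`), the grid origin and the sample values are all hypothetical. The person is assumed to stay in the sampled cell until the next path sample, and each segment uses the concentration of the time step in which it starts.

```python
def cell_index(x, y, x_min, y_max, cell_size):
    """Map a coordinate to (row, col) of a north-up raster."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col


def path_exposure(path, rasters, x_min, y_max, cell_size, time_step=1.0):
    """Time-weighted exposure along a path of (t, x, y) samples.

    Returns the summed concentration x duration (e.g. ug/m3 * h).
    Assumes samples are sorted by t and that a segment does not
    span more than one raster time step.
    """
    total = 0.0
    for (t0, x, y), (t1, _, _) in zip(path, path[1:]):
        row, col = cell_index(x, y, x_min, y_max, cell_size)
        step = int(t0 // time_step)
        total += rasters[step][row][col] * (t1 - t0)
    return total


# Made-up example: two hourly 2 x 2 rasters with 5 m cells.
rasters = [
    [[10.0, 20.0], [30.0, 40.0]],  # hour 0
    [[12.0, 22.0], [32.0, 42.0]],  # hour 1
]
# (time in hours, x in m, y in m); the last sample only closes the path.
path = [(0.0, 1, 9), (0.5, 6, 9), (1.0, 6, 1), (1.5, 6, 1)]

exposure = path_exposure(path, rasters, x_min=0, y_max=10, cell_size=5)
mean_concentration = exposure / (path[-1][0] - path[0][0])
print(exposure, mean_concentration)
```

Dividing the summed exposure by the duration of the path gives a time-weighted mean concentration, which is easier to compare between individuals with paths of different length.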
GGHDC Team
Human Geography & Planning
University Medical Centre Utrecht
Physical Geography (PCRaster)
ITS
Institute for Risk Assessment
SURFsara
GLOBAL GEO HEALTH DATA CENTER
Core team: Rick Grobbee, Martin Dijst, Bert Brunekreef, Derek Karssenberg, Ilonca Vaartjes, Folkert-Jan de Groot, Kor de Jong, Oliver Schmitz, Ivan Soenario, Maciek Strak, Harm de Raaff, Leendert van Bree, Peter Hessels, Dick Ettema
Contributions: Michiel Geijer, Gerard Hoek, Anna-Maria Ntarladima, Maartje Poelman, Mei-Po Kwan, Monique Simons, Carlijn Kamphuis, Marco Helbich, Amit Birenboim, Maarten Zeylmans van Emmichoven
Funded by Utrecht University
info@globalgeohealthdatacenter.com
GGHDC Software Architecture
Software architecture: web app + services
Cohort
Personal exposure
Environmental Modelling Information Service (EMIS)
HPC, data, models, task-queue, …
Portal server
Portal client
Portal web app
'Thin' web app: first client of the functionality provided by EMIS
Docker Swarm
Cooperating services, deployed in Amazon EC2
internet
Possible future clients:
- Mobile apps for e-health
- Python package for scripting
Environmental datasets:
- 6 air pollution models, 25 predictors
- 5 m grid and hourly time step nationwide
- 1.5 billion raster cells, 15 GB per timestep
- 125 TB per model per modelled year
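The data volumes quoted above can be checked with a back-of-the-envelope calculation. The sketch assumes decimal units (1 GB = 10^9 bytes, 1 TB = 1000 GB), reads the quoted per-timestep size as gigabytes, and takes a 365-day year of hourly time steps; the variable names are chosen for this example only.

```python
CELLS = 1.5e9               # raster cells nationwide
GB_PER_TIMESTEP = 15        # quoted size of one hourly raster
HOURS_PER_YEAR = 365 * 24   # 8760 hourly time steps

# Implied storage per cell, in bytes.
bytes_per_cell = GB_PER_TIMESTEP * 1e9 / CELLS

# Implied volume for one model and one modelled year, in TB.
tb_per_year = GB_PER_TIMESTEP * HOURS_PER_YEAR / 1000

print(f"{bytes_per_cell:.0f} bytes per cell")
print(f"{tb_per_year:.1f} TB per model per year")
```

Under these assumptions the estimate comes out at roughly 130 TB, the same order of magnitude as the quoted 125 TB; the gap presumably reflects rounding of the quoted figures.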
Geocomputation on HPC facilities:
- parallel algorithms
- distributed algorithms
- parallel I/O
3rd party libraries
Boost, (parallel) HDF5, HPX, …
Core: modelling support (C++)
LUE data model, parallel algorithms, frameworks for temporal modelling, error propagation, agent-based modelling, …
API: Python binding
User applications
Agent- and field-based models
Modelling software stack
• Target user is a domain expert, not a programmer
• Target domain undefined: generic building blocks
• Target platform is any platform, including HPC clusters
Modelling environment
Cartesius: the Dutch supercomputer https://userinfo.surfsara.nl/systems/cartesius