MapAction has been involved in the response to the earthquake that took place in Haiti on 14 August, helping our partners with data processing, analysis and mapping. This has helped those coordinating operational teams to understand what types of aid are needed in different locations and what other organisations are already doing to help. At the time of writing, this work is ongoing.
At the end of August, we scaled up our support to the UN Disaster Assessment & Coordination (UNDAC) and other responding organisations. Two MapAction volunteers traveled to Haiti to provide in-person assistance, supported remotely by our wider team. As well as using their annual leave to do this, both were required to self-isolate for 10 days after returning to the UK, in accordance with COVID rules. We are grateful to them both for their invaluable efforts.
This StoryMap looks at some of the maps that have so far been created during the response to the earthquake and how they have been used to help the situation on the ground.
By Egor Zverev Egor is working with us temporarily through Google’s Summer of Code programme.
How could I apply my programming and data science skills to make the world a better and safer place? I’ve been struggling to figure that out for quite some time, and finally after three years of studying computer science at MIPT in Moscow, I found an opportunity to fulfil my dreams.
Hi, I’m Egor, and I want to write about the impact I am making while working on my Google Summer of Code (GSoC) project at MapAction!
I decided to join the GSoC programme as I felt it was an amazing opportunity to spend my summer working on a real-world open-source project. The programme offered me 202 organisations and over a thousand projects to choose from, but MapAction stood out as the only humanitarian organisation among them, so the choice was obvious to me. I faced some stiff competition as 25 other candidates applied for this role, so I am so grateful for the opportunity to join MapAction in its mission.
My GSoC began with a bonding period, and even that was amazing! I was introduced to MapAction during one of its many training days. I listened to various lectures given by the MapAction team. I was especially inspired by Hannah’s presentation as she is working at both MapAction and UN OCHA (the UN Office for the Coordination of Humanitarian Affairs) where she’s developing an anticipatory action framework. Talking to her was a fascinating part of my GSoC experience as it made me think hard about how I could help solve some of the world’s problems. Following that, I had a week of meeting various people from MapAction. Each encounter was special in its own way. After my first week, I already felt like I was a part of the team, an ideal time to start coding.
I have been working on the data pipeline project: a MapAction tool to automate the acquisition and transformation of data. During the early stages of emergency response, it’s crucial to gather all necessary data as quickly as possible. My goal was to extend the pipeline from three to 22 data products. This will allow for visualisation of much more infrastructure and landscape features etc. After adding the initial five products, I realised that the code required a serious refactoring as it was quite unwieldy and difficult to deal with. During the first stage I managed to fix many local problems and reduced the total amount of code by almost 30%. Going forward, I am planning to redesign the entire pipeline’s architecture and implement a new design. After this I hope to add unit tests to ensure the code is correct.
As most of MapAction’s developers are volunteers who only work for a couple of hours per week, a simplified pipeline will make it much easier for both them and any newcomers to make sense of it and use it. My work has also increased the readability of the code and made future pipeline development much faster.
In summary, not only have I already added many valuable datasets to the pipeline that will allow MapAction volunteers to easily understand the locations of rivers, airports, country boundaries, etc. I am also bringing fundamental changes to the project that will make the life of MapAction’s volunteers much easier. I feel very proud of the impact I am making and it is an honour for me to spend my summer working on this project.
The Heidelberg Institute for Geoinformation Technology (HeiGIT) and MapAction have signed a Memorandum of Understanding (MoU) outlining their plans to collaborate in a number of areas.
Both organisations share a common vision to support humanitarian mapping by providing innovative geoinformation services for humanitarian response, mitigating risk, anticipatory action and economic development. This includes developing up-to-date disaster maps based on OpenStreetMap (OSM).
In order to fulfil these objectives, HeiGIT and MapAction will work together on various activities involving research and development, teaching, outreach and fundraising.
Examples of current and emerging services we plan to jointly develop include OSM analytics such as Humanitarian OSM Stats, which provides detailed statistics about humanitarian mapping activities in OSM, as well as OSM data-quality assessment and improvement. Here, the ohsome quality analyst is of particular interest, which provides end users such as humanitarian organisations with information on the quality of OSM data for a specific region and use-case.
Further tools may include apps for crowdsourcing, data collection, navigation, routing and logistics services and machine-learning-based methods for data processing and enrichment.
We will also share knowledge, with MapAction contributing experiences aligned to HeiGIT’s teaching curriculum, and HeiGIT, in return providing teaching materials and research results to MapAction.
When MapAction triggers an emergency response, the first step is for a team of staff and volunteers to begin what is known as a “data scramble”. This is the process of gathering, organising, checking, and preparing the data required to make the first core maps that emergency response teams will need, which will also be used as the basis for subsequent situational mapping.
Traditionally, the aim was to complete this data collection as quickly as possible, to get as much data as possible that was relevant to the emergency. However, due to the time-sensitive nature of this work, the team is often unable to dissect in detail the different data source options, processes, and decisions involved as they ready the data for ingestion into maps.
What if they weren’t constrained by time during the data scramble? What if they could deconstruct the procedure and examine the data source selection, scrutinise the processing applied to every data type, and explore the ways that these steps could be automated? To answer these questions, the volunteers at MapAction, with support from the German Federal Foreign Office, have been tackling a stepping-stone project leading towards automation, dubbed the “slow data scramble”. We called it this because it is a methodical and meticulous deconstruction of a rapid data scramble as carried out in a sudden-onset emergency.
As part of our Moonshot, MapAction is looking to automate the creation of nine core maps that are needed in every response, freeing up vital time for volunteers during an emergency, and, perhaps more importantly, identifying data issues and gaps well before the onset of an emergency. Towards this end, we have just released version 1.1 of our software MapChef, which takes processed data and uses it to automatically create a map. However, even with MapChef up and running, there is still a large gap in our pipeline: how do you get the data in the first place? How do you make sure it’s in the right state to go into the map? And which data do you actually need?
The volunteer team created and led a project intended to answer precisely the above questions, with the goal of scoping out the pipeline. This would include writing the code for completing the above operations, although not yet packaging things together in a smooth way – that is saved for a future pipeline project.
Selecting the right components
The first step was to determine what data is required to produce the core maps. The volunteers identified a list of 23 ingredients that make up these maps, which we call “data requirements”. These range from administrative boundaries to roads, and from airports to hillshading (a technique for creating relief maps). To complicate matters, each data artefact had multiple possible sources. For example, administrative boundaries could come from the Common Operational Datasets (CODs, distributed by the Humanitarian Data Exchange), the Database of Global Administrative Areas (GADM), or geoBoundaries.
“The scale and extent of data available for just a single country administrative area alone is staggering.”
James Wharfe, MapAction volunteer
Next, the team needed to address how to obtain the data and ready it for further processing. Normally, when volunteers make maps by hand, they go to the website associated with each artefact, manually download it, and tweak it by hand until it is ready to be used in a map. However, with the pipeline this all needs to be automated.
To approach this considerable undertaking, the team divided up the work into small, digestible tasks, meeting fortnightly to discuss progress, answer each other’s questions, and assign new tasks. This work continued diligently for seven months, at the end of which they had a functional and documented set of code snippets capable of automatically downloading and transforming the data required for all artefacts.
There were numerous challenges along the way that the team needed to overcome. Understanding the differences between the various data sources proved a significant hurdle. “The scale and extent of data available for just a single country administrative area alone is staggering,” noted volunteer James Wharfe. (Indeed, this data landscape is so complex that it merits its own post – stay tuned for a blog about administrative boundaries as part of our upcoming “Challenges of…” series.)
One particular data source that seemed to crop up everywhere was OpenStreetMap (OSM). Almost all of the data requirements in the slow data scramble are available from OSM, making it a key data source. However, given the sheer detail and size of the OSM database – 1,37 terabytes as of 1 Feb, 2021(source) – there are several difficulties involved when working with the data.
For the download step, the team decided to invoke the Overpass API, and create a Python method to abstract the complex query language down to some simple YAML files with OSM tag lists. Next, the downloaded data needed to be converted from the OSM-specific PBF format to a shapefile, which is the type of data expected by MapChef. Several solutions for this exist: to name a few, Imposm, PyDriosm, Osmosis, OSM Export Tool, and Esy OSM PBF. For this project, we decided to use GDAL, however, we certainly plan on exploring the other options, and hope to eventually host our own planet file.
Even though the goal of the slow data scramble was not to produce production-quality code, the team still used Git to host their version-controlled code. According to Steve Penson, the volunteer leading the project, “The collaborative and explorative nature of the project meant Git was incredibly useful. With each volunteer tackling significantly different challenges, establishing a strong code control setup made our weekly reviews far easier.”
The team also used the opportunity to extend their Python skills, with a particular focus on GeoPandas, which enables some of the more intricate data transformations that are normally performed by mainstream desktop GIS tools.
Additionally, the group used this work to explore the concept of DAGs, directed acyclical graphs. This term refers to the building blocks of any pipeline: a recipe, or series of steps, that you apply to your data. There are scores of packages available to assist with pipeline development, but to start, the team decided to use a simple workflow management system called Snakemake. Snakemake works by using Makefiles to connect the expected input and output files across multiple pipeline stages. Although, in the end, the team decided it was not the best solution for scaling up to the real pipeline (which is now being developed with Airflow), they agreed that using Snakemake was a great stepping stone to becoming familiar with this key concept.
Finally, before COVID-19 hit, MapAction’s dedicated volunteers were accustomed to meeting in person once a month – a commitment that led to many enjoyable shared moments and close friendships. This positive and much-loved aspect of being a volunteer at MapAction has unfortunately been hindered by the pandemic. Although still conducted fully remotely, the slow data scramble offered the chance to regularly meet, share expertise, motivate and encourage each other, and work together. Volunteer Dominic Greenslade said it well: “MapAction volunteers are amazing people, and the ability to spend so much time getting to further these friendships was a great bonus”.
MapAction has been chosen as one of 202 organisations taking part in Google’s 2021 Summer of Code, a global programme that aims to bring student developers into open source software development. As part of the scheme, which has now entered into its 17th year, students can apply for placement projects from 202 open source entities, with their time paid for by Google. MapAction is one of 31 organisations taking part for the first time.
Since launching, over 16,000 students across 111 countries have taken part by working with an open source organisation on a 10-week programming project during their summer break from studies. Google Summer of Code is open to students who are age 18 and older and enrolled in a post-secondary academic programme in most countries. As MapAction is one of the only companies taking part from the humanitarian charity sector, it’s a great opportunity to highlight the importance of technology advances to our work.
As part of our Moonshot initiative, two students will be helping us with our goal of automating the production of core maps needed in any humanitarian crisis, for 20 priority countries. Being able to automate these maps means essential contextual and reference information about, among other things, the local environment, population and infrastructure, is immediately available when needed in the best possible quality. The students will be working with MapChef, our Python-based map automation tool, and our MVP pipeline framework for automated data acquisition and processing. As our capability grows, we intend to use these systems to identify data gaps at regional levels.
Take a look at our project ideas for the Google Summer of Code. Applications officially open on 29 March and we anticipate a lot of interest.