Building Capacity and Improving Decisions with YouthMappers

Photo courtesy of Project team/Gulu YouthMappers

With contributions from Stella Nacakwa, M.S. Candidate, West Virginia University & Courtney Clark, YouthMappers.

The YouthMappers Chapter at Gulu University, in partnership with YouthMappers, West Virginia University, the Water Point Data Exchange (WPdx) and the U.S. Agency for International Development (USAID), recently launched the Uganda Water Infrastructure Mapping Project (U-WIMP). The U-WIMP project has two complementary goals:

  • building the technical capacity of Ugandan YouthMappers in both mapping and spatial decision-support analysis, and
  • supporting the Gulu District Water office in making evidence-based decisions through the collection of digitized water data and the application of cutting-edge data analytics.

Water point data collection in Gulu has historically been paper-based, making it challenging to compile, analyze, share, and use the data to inform decisions. The YouthMappers U-WIMP project will provide much-needed data updates as well as pilot an approach to digitize drinking water resource monitoring. Dr. Denis Nono, a Gulu University lecturer specializing in water resources, and advisor to the project, applauded the engagement of YouthMappers students as this will provide a significant resource to the Gulu Water District Office as well as a hands-on learning opportunity.

Data collected through the project will be uploaded and harmonized with existing records on the Water Point Data Exchange (WPdx). WPdx is an online platform for sharing, accessing, and using water point data that currently hosts over 600,000 records from over 50 countries. The WPdx database includes 557 water point records in Gulu District, reported between 2010 and 2022. WPdx hosts a suite of decision-support tools that help decision-makers optimize limited resources by prioritizing locations for investment, including preventative maintenance, rehabilitation, and new construction. The tools also provide updated estimates of drinking water service coverage in rural areas for local communities, host country governments, and watershed authorities. Results from the decision-support tools are visualized on interactive web-maps and summary graphs which can be utilized by a range of audiences. Additionally, the collected data will be utilized in combination with high-resolution satellite imagery to build models which can automatically detect water points.

Following data collection and analysis, the team will host a series of workshops with government, NGO and academic stakeholders to share the findings and explore how the information can be used to inform decision-making processes and work planning in the district.

The project activities will build the chapter student members’ technical knowledge of the water sector and serve to connect these youth with their communities. The project also emphasizes gender considerations through the YouthMappers Everywhere She Maps campaign which seeks to improve the availability of geographic datasets for women’s economic empowerment. 

This project will also raise global awareness among YouthMappers Chapters of the importance of evidence-based decision making and effectively managing water points and water resource allocation, and it will build on WPdx’s current work to improve decisions which ultimately lead to an increase in sustainable access to water services.  

Project Objectives

  1. Build capacity of the Gulu University YouthMappers chapter through introduction to rural water challenges and practical experience in collecting field data.  
  2. Pilot deployment of a team of YouthMappers students utilizing open-source software to collect up-to-date digitized water point data to provide a better understanding of current water access and create a more updated dataset for analysis through the WPdx platform.
  3. Partner with district, regional, and national government authorities to integrate findings from decision-support tools into budgeting and planning decisions. 
  4. Explore the feasibility of using satellite imagery to identify existing water points, both manually and through AI/ML predictive models to help build a more comprehensive inventory which can then be used to help prioritize investments in water point monitoring, preventative maintenance, and rehabilitation to support sustainable service delivery.

Project Timeline

An initial set of in-person trainings for the Gulu University YouthMappers student chapter was held in December 2021 and January 2022, including introductions to rural water, OpenStreetMap, and WPdx, and a practical session on water point data collection. Virtual trainings have continued throughout early 2022, focused on the application of OpenStreetMap to map and visualize features. Additional details and updates on progress on the project can be found on the OpenStreetMap wiki platform.

Next Steps

A comprehensive data collection effort for two sub-counties in Gulu district is scheduled for early summer 2022. Collected data will be cleaned and uploaded to both OSM and WPdx. Collected data will also be utilized to build a training data set using high-resolution satellite imagery for the development of models which can automatically detect water point locations. Decision-support analyses to estimate current coverage levels, and priority areas for rehabilitation, new construction and preventative maintenance will be conducted through the WPdx platform. The results of the data collection and combined analyses will be shared with government and NGO stakeholders during a collaborative workshop.

 

Celebrating Open Data Day 2022: The Power of Rural Water Point Data for Evidence-Based Decisions

WPdx is excited to continue to promote transparent data sharing and use of open data in the rural water sector through our second annual Open Data Day celebration!

Launch of Decision Support Tools Web App

To celebrate Open Data Day 2022, we are pleased to announce the launch of the new beta WPdx Decision Support Tools web app.

The WPdx Decision Support Tools interactive web app allows users to view and explore available water point data and analytical results from the WPdx+ dataset and suite of decision-support tools.

For more information on the launch of the tools, please visit our related blog post and/or see the links below which provide detailed information on each of the available tools. 

  1. Administrative Region Analysis
  2. Rehabilitation Priority Analysis
  3. New Construction Priority Analysis
  4. Data Staleness Analysis
  5. Functional Status Prediction Analysis (coming soon)

Recognition of Leaders in Data Sharing

In the past year, more than 50,000 new water point records have been uploaded to the WPdx platform from 14 different organizations. We want to take this opportunity to recognize and celebrate the following entities that have demonstrated their commitment to transparency and accountability by sharing data with the WPdx platform.

Interested in sharing data with WPdx? Please see here for more details or contact info@waterpointdata.org with questions.

Direct Contributors to WPdx

  • Inter Aide
  • IRC WASH
  • Uganda Water Project
  • USAID Lowland WASH Activity, implemented by DT Global
  • Village Water
  • Water4
  • Water & Sanitation for the Urban Poor (WSUP)
  • Water For People
  • World Serve International
  • YouthMappers

Contributors via Open Data Portals 

Africa GeoPortal

  • Grid3

Humanitarian Data Exchange (HDX)

  • iMMAP
  • United Nations Office for the Coordination of Humanitarian Affairs
  • REACH Initiative

Transforming Data to Action

Over the past year, WPdx has continued to work with government and NGO partners in Ethiopia, Ghana, Uganda and Sierra Leone.

During 2022 we continue to work to integrate the results from the WPdx decision support tools to strengthen existing decision-making processes.

Launch of WPdx Decision Support Tools

New WPdx Decision Support Tools

We are excited to release the new suite of WPdx Decision Support Tools (v.1.0 beta).

The WPdx Decision Support Tools interactive web app allows users to view and explore available water point data and results from the WPdx+ dataset and suite of decision-support tools.

The decision-support tools provide insights on rural basic water services for each available administrative division, recommendations for where water points should be rehabilitated or constructed, an overview of the average age of available data, and predictions of likely water point status (coming soon). The results from these analyses can provide decision-makers with tangible evidence for allocating resources and developing work plans to improve rural water services. 

For more information on each decision-support tool, please visit the following links or open the “Information” pop-up on the web app.

  1. Administrative Region Analysis
  2. Rehabilitation Priority Analysis
  3. New Construction Priority Analysis
  4. Data Staleness Analysis
  5. Functional Status Prediction Analysis (coming soon)

Please check out our detailed WPdx User Guide for more information about how to use the entire WPdx platform.

WPdx Decision Support Tools Quick Guide

1. Focus. The landing page of the app showcases the entire WPdx+ dataset. From here, users can scroll to view data and results from the global dataset or zoom into geographies of interest by selecting ‘Filter by Region’ from the header bar. Depending on the country, users can filter all the way down to the Admin 4 level.

2. Filter. Users can also filter available data by water point source, water point technology or management type if they are seeking information on specific types of water points. Please note that the analyses are conducted on the comprehensive dataset, not on the filtered view.

3. Explore. Users can select the desired decision-support tool from the drop-down menu and view the results from each tool for their geography of interest.

4. Download. Results are available for download in CSV format. Select Download data from the menu in the upper left corner and choose the results file.

Example Use Cases

View Water Points

The View Water Points tool allows users to explore all functional and non-functional water points available in the WPdx+ dataset. Users can choose to filter by region to a specific country, district, or sub-district of interest, and click on individual water points to learn more about that point. Additional filtering options allow users to view water points based on source, technology, and management.

Administrative Region Analysis

The Administrative Region Analysis Tool provides an overview of the rural population with access to basic services, without access to basic services, and uncharted (i.e., data is not available in WPdx to determine access for these populations) for each available administrative level.

  • Rural Population with Basic Access: Population within 1km of a functional water point
  • Rural Population without Basic Access: Population within 1km of a non-functional water point (but not within 1km of a functional water point)
  • Uncharted Rural Population: Population for which no data on water services is available in WPdx. These populations may be without basic access or basic services may exist, but data has not been shared with WPdx.
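To make the three categories concrete, here is a minimal sketch of how a population grid cell could be classified against nearby water points. It is an illustration of the definitions above, not the WPdx implementation: the record layout (`lat`, `lon`, `functional`, `population`), the function names, and the use of straight-line (haversine) distance are all assumptions.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def classify_cell(cell, water_points, radius_km=1.0):
    """Return 'basic_access', 'no_basic_access', or 'uncharted' for one cell."""
    near_nonfunctional = False
    for wp in water_points:
        if haversine_km(cell["lat"], cell["lon"], wp["lat"], wp["lon"]) <= radius_km:
            if wp["functional"]:
                return "basic_access"  # any functional point within 1km is enough
            near_nonfunctional = True
    return "no_basic_access" if near_nonfunctional else "uncharted"

def summarize(cells, water_points):
    """Sum population per category across all cells."""
    totals = {"basic_access": 0, "no_basic_access": 0, "uncharted": 0}
    for cell in cells:
        totals[classify_cell(cell, water_points)] += cell["population"]
    return totals

# Hypothetical sample data (coordinates and populations are made up).
water_points = [
    {"lat": 2.7700, "lon": 32.3000, "functional": True},
    {"lat": 2.8000, "lon": 32.3500, "functional": False},
]
cells = [
    {"lat": 2.7705, "lon": 32.3005, "population": 120},
    {"lat": 2.8003, "lon": 32.3502, "population": 80},
    {"lat": 2.9000, "lon": 32.5000, "population": 50},
]
print(summarize(cells, water_points))
```

Dividing each total by the rural population of the administrative division would give the percentages shown on the maps.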

Users can view choropleth maps, which show, for each administrative region, the percentage of the rural population With Basic Access, Without Basic Access, and Uncharted.

Illustrative Uses

  •  Prioritizing administrative divisions for budget and resource allocations
  • Identifying target administrative divisions for interventions
  • Evaluating equity

Rehabilitation Priority Analysis

The Rehabilitation Priority Tool provides recommendations for which non-functional water points should be considered for rehabilitation and repair. The tool also provides insights on which water points are critical in that there are limited nearby alternatives and which water points are being over-utilized. Results can be viewed and filtered based on:

  • Potential population that would regain access if point was repaired (default)
  • Total population within 1km of water point
  • Crucialness of water point (i.e., are there alternative water points nearby)
  • Pressure on the water point (i.e., is the water point over or under-utilized)

Illustrative Uses

  • Prioritizing which water points to rehabilitate
  • Highlighting areas where there are limited alternative water points available
  • Understanding which water points are over- or under-utilized
  • Benchmarking rehabilitation needs to inform district budgets and workplans  

New Construction Priority Analysis

The New Construction Priority Analysis Tool evaluates all possible locations where a water point could be constructed in a given administrative area and estimates how many people who are not near an existing water point (regardless of functionality) could gain access if a water point were constructed in that location.
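As a rough sketch of this idea, the snippet below scores a handful of candidate sites by the population that lies within 1km of the candidate but not within 1km of any existing point. Everything here is assumed for illustration: coordinates are hypothetical projected positions in metres, the candidate list is hand-picked rather than exhaustive, and the function names are not part of WPdx.

```python
import math

def people_gaining_access(candidate, cells, existing_points, radius_m=1000.0):
    """Population near the candidate site that is not near any existing point."""
    gained = 0
    for cell in cells:
        xy = cell["xy"]
        if math.dist(xy, candidate) > radius_m:
            continue  # too far from the candidate to benefit
        if any(math.dist(xy, p) <= radius_m for p in existing_points):
            continue  # already near an existing point, functional or not
        gained += cell["population"]
    return gained

def rank_candidates(candidates, cells, existing_points):
    """Return (population gained, candidate) pairs, best candidate first."""
    scored = [(people_gaining_access(c, cells, existing_points), c) for c in candidates]
    return sorted(scored, key=lambda s: s[0], reverse=True)

# Hypothetical projected coordinates in metres.
existing_points = [(0, 0)]
cells = [
    {"xy": (500, 0), "population": 100},
    {"xy": (3000, 0), "population": 200},
    {"xy": (3500, 500), "population": 150},
    {"xy": (6000, 0), "population": 90},
]
candidates = [(800, 0), (3200, 200), (6000, 100)]
ranked = rank_candidates(candidates, cells, existing_points)
print(ranked)
```

In this made-up example the site at (3200, 200) ranks first because it reaches two unserved cells, while the site at (800, 0) gains nothing because its only nearby cell is already close to an existing point.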

 

Illustrative Uses

  • Identify locations to construct new water points
  • Evaluate the relative benefit of new construction compared to rehabilitating existing water points
  • Provide insights on potential data gaps which could be filled by uploading data to WPdx

Data Staleness Analysis

The Data Staleness Analysis provides a relative measure of the average age of data available from the WPdx+ dataset. 
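One simple way to think about "average age" is the mean time since each record's report date. The sketch below assumes exactly that; the actual WPdx staleness calculation may weight or scale records differently, and the function name is hypothetical.

```python
from datetime import date

def mean_age_years(report_dates, as_of):
    """Mean age of records in years, measured back from the `as_of` date."""
    if not report_dates:
        return None  # no data available for this area
    total_days = sum((as_of - d).days for d in report_dates)
    return total_days / len(report_dates) / 365.25

# Hypothetical report dates for one administrative division.
dates = [date(2020, 1, 1), date(2021, 1, 1)]
age = mean_age_years(dates, as_of=date(2022, 1, 1))
print(round(age, 2))
```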

Illustrative Uses

  • Identifying areas for targeted data sharing outreach
  • Selecting areas for focused data collection
  • Ensuring a clear understanding of the age of data available for other analyses

Questions and Feedback

Please contact info@waterpointdata.org with any questions.

Interested in sharing data with WPdx? Please see here for more details.

Global Water Challenge Announces the Development of WHdx – an Open Data Exchange to Improve WASH in Healthcare Facilities

Global Water Challenge and the Water Point Data Exchange (WPdx) are excited to announce that development has begun on a data exchange platform for water, sanitation, and hygiene (WASH) in health care facilities. Please see the press release for additional details.

The new platform will be a critical resource for governments, NGOs, and companies to close the gap of 1 in 4 healthcare facilities without basic water services. The platform will be developed in partnership with the Millennium Water Alliance and with funding from the Conrad N. Hilton Foundation. The WASH Health Facility Data Exchange (WHdx) platform will support decision makers to improve health services through optimized water, sanitation and hygiene (WASH) investments.

According to WHO/UNICEF 2020, globally, 1 in 4 healthcare facilities lack basic water services, impacting more than 1.8 billion people – worsened by large gaps in sanitation, hygiene, and waste management services. As a result, healthcare providers are unable to provide quality patient healthcare and put themselves at risk of infection, a reality further intensified during the COVID-19 pandemic. Given the often-limited resources available, health and WASH leaders must prioritize which facilities receive improvements even when they lack a clear understanding of the gaps.

WHdx will harmonize healthcare facility WASH data into a singular, publicly available dataset through the establishment of a data standard, providing unique data analysis and decision-making tools for both the water and health sectors. Furthermore, WHdx will be able to provide WASH service records from individual health facilities over time and compare health facilities across geographies from village to country-levels, showing locations of greatest need, problematic issues, and recommendations for highest impact interventions.

Building on the Water Point Data Exchange (WPdx), the world’s largest rural water open data platform with 600,000 water point records from over 80 organizations across more than 50 countries, development of the WHdx platform is a collaboration between WASH and health sector experts to ensure that consistent, user-friendly data is readily available for evidence-based decisions.

The WHdx platform will be guided by a working group including Catholic Relief Services, Centers for Disease Control and Prevention (CDC), Emory University, Helvetas, the Safe Water and AIDS Project (SWAP), Millennium Water Alliance, and Global Water Challenge. The process of selecting standard parameters for the platform is currently underway.

 

 

 

WPdx Launches New Rehabilitation Priority Tool

In an ongoing effort to support improved rural water access investment decision-making, WPdx announces the launch of its updated Rehabilitation Priority Tool which enables users to immediately identify specific water points for prioritized rehabilitation or repair based on population.

The input for this updated analytical tool is the new WPdx+ dataset, a further enhanced and refined version of the original WPdx dataset (WPdx-Basic) which includes additional data cleaning and processing steps for more robust analysis.

Rehabilitation Priority Tool Overview:

  • A series of geospatial population-based analyses to prioritize water points based on potential impact.
  • Additional parameters to consider when prioritizing areas for rehabilitation, including:
    • Population within 1km – Total population within 1km of the water point
    • Users who would gain access – Estimated number of people who would gain access if a currently non-functional water point was rehabilitated. Population assigned to water point considers the existence of functional water points within a 1km radius. Populations are assigned based on relative distance between each population grid cell and the water points.
    • Likely current users – Estimated number of people who could be currently using a working water point. Population assigned to water point considers the existence of functional water points within a 1km radius. Populations are assigned based on relative distance between each population grid cell and the water points.
    • Crucialness score (0-100%) is the ratio of potential users to the total local population within a 1km radius of the water point. Crucialness provides a measure of water system redundancy. For example, if there is only 1 water point within a 1km radius, the water point crucialness score is 100%, meaning that there are no nearby alternatives. If there are two functional water points within 1km, the crucialness score for each point will be ~50% indicating there is some redundancy in the system, so if one water point is broken down, users have an alternative water point available. For non-functional water points, the crucialness score shows how important the water point would be if it were to be rehabilitated. See example here.
    • Pressure score (0-100%) is calculated based on the ratio of the number of people assigned to that water point over the theoretical maximum population which can be served based on the technology. If a point is serving less than the recommended maximum, the pressure score will be less than 100% (e.g., 250/500 = 50%). If a point is serving more than the recommended maximum, the pressure score will be over 100% (e.g., 750/500 = 150%). The following recommended maximum values (extended from Sphere Guidelines) are currently in use:
      • 250 people per tap [tapstand, kiosk, rainwater catchment]
      • 500 people per hand pump [all hand pumps]
      • 400 people per open hand well [rope and bucket]
      • 1,000 people per mechanized well
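The two scores described above reduce to simple ratios once the number of people assigned to a point is known. The sketch below assumes that assignment has already been done. The technology keys and function names are hypothetical labels, and the maximum values are the ones listed above.

```python
# Recommended maximum users per technology, from the list above.
# The dictionary keys are hypothetical labels chosen for this example.
MAX_USERS = {
    "tap": 250,
    "hand_pump": 500,
    "open_hand_well": 400,
    "mechanized_well": 1000,
}

def crucialness(assigned_users, population_within_1km):
    """Share of the local population that depends on this point, in percent."""
    if population_within_1km == 0:
        return 0.0
    return 100.0 * assigned_users / population_within_1km

def pressure(assigned_users, technology):
    """Assigned users relative to the recommended maximum, in percent."""
    return 100.0 * assigned_users / MAX_USERS[technology]

print(pressure(250, "hand_pump"))   # the 250/500 example from the text
print(pressure(750, "hand_pump"))   # the 750/500 example from the text
print(crucialness(400, 400))        # only point within 1km
print(crucialness(200, 400))        # population split evenly between two points
```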

Quick peek:

6 key tool features and options:

 

1. Users can filter based on country and administrative division name down to the administrative division 3 (adm3) level.

 

 

2. Users can filter water points by source (borehole, shallow well, spring, etc.), technology (handpump, mechanized pump, etc.) and management (community management, direct government operations, etc.)

3. The Top Water Points table shows the top 15 water points which would be recommended for priority consideration.

  • The default setting will show priority based on number of ‘Served Pop.’
  • For working water points, ‘Served Pop.’ represents ‘Likely Current Users’ and for non-functional points, ‘Served Pop.’ represents ‘Potential users who could regain access’.
  • Users can also click on ‘Population within 1km’, ‘Crucialness’ or ‘Pressure’ and the table will be updated to show the priority for each of these parameters.
  • Users can select to show/hide functional points and points in urban areas, and the table will update to reflect these choices.
 
 

 

 

4. Users can select options to show/hide different layers, including functional points, population data and roads/buildings. Key options are available in the top selection bar, with additional options in Settings.

 

 

5. The Legend describes the different visualizations possible through various Settings selections.

 

 

 6. Users can download the full table of results by selecting ‘Download Data’.

  • If you have filtered to a specific location, all data in that administrative area will be included in the download.
  • If you have zoomed in to a sub-area of interest, the download will include all visible data or all filtered data.

Please feel free to ask questions and provide feedback on the new tool.

Introducing WPdx+

The Water Point Data Exchange (WPdx) is pleased to announce the launch of a new analysis-ready dataset, Water Point Data Exchange Plus (WPdx+). WPdx+ is a further enhanced and refined version of the original WPdx dataset, now known as WPdx-Basic.

Through the online data playgrounds, all users can:

  • Sort and filter data
  • Create custom sub-datasets based on location or other parameters of interest
  • Visualize data using charts, graphs and simple maps
  • Download/export data

The WPdx+ dataset is focused on a subset of countries for which WPdx has enough data for the decision support tools to be activated. The tools and dataset enhancements can be made available for any country if a representative dataset can be shared with WPdx.

The WPdx+ dataset is the input for the new suite of decision-support tools which are under development. Please check out the recently released updated Rehabilitation Priority tool. Additional updates to the remaining tools will be released in the coming months.

For more information on how to add a new country to WPdx+, email info@waterpointdata.org with “New Country Interest” in the subject line.

Please see below for a brief summary of the two datasets:

WPdx-Basic

  • All data shared to WPdx is included in the WPdx-Basic Global Data Repository.
  • There are five validation/data-cleaning steps which occur during the ingestion process:
    • Ensure that all records contain the required parameters. Records that do not contain required parameters are not uploaded. A summary of records included in the upload can be found in the WPdx Data Catalog.
    • Check to ensure that points are located within a country boundary per GADM boundaries. Points which do not fall within country boundaries (e.g., in the middle of the ocean) are not uploaded. A summary of records included in the upload can be found in the WPdx Data Catalog.
    • Formatting of entries for consistency. For example, for the Presence of Water When Assessed (#status_id) parameter, entries in the repository will be shown as “Yes” or “No”.
    • Addition of ‘clean’ version of country name, #adm1, #adm2, #adm3 based on provided GPS coordinates and GADM boundaries. ‘Clean’ values are appended to the record, leaving all original data intact.
    • Addition of “water_source_clean”, “water_tech_clean” and “management_clean” columns. These new columns are created using fuzzy matching to organize entries into consistent categories. For more information on the cleaning process, please see here.

WPdx+

  • Focused dataset on countries for which WPdx has enough data for decision-support tools to be activated.
  • Further data processing steps including:
    • Removal of records identified as having location mismatches (i.e., data provided states that record is from Country X, but GPS location is in Country Y)
    • De-duplication for any records mistakenly uploaded twice (exact matches only).
    • Assignment of a WPdx_id which matches water point records shared by different organizations and on different dates, based on GPS location.
  • Addition of external relevant data sources which are used in the water point status predictions models, including:
    • Distance between water point and nearest road (primary, secondary and tertiary), town and city using OpenStreetMap data.
    • Additional external data coming soon!
  • Tabular access to results from advanced decision support tools:
    • Rehabilitation Priority
      • Which non-functional water point should be prioritized for repair?
      • Population living within 1km of water point
      • Likely current and potential users
      • Crucialness of water point (are there alternate working points nearby?)
      • Pressure on water point (is the point over or under utilized?)
    • Water Point Status Predictions – which water points are at a higher risk of failure? (coming soon)
    • Construction Priority – which locations should be considered for new construction to reach unserved populations? (coming soon)
    • Measure water access by administrative division – how does coverage vary within different sub-national divisions? (coming soon)
  • Inclusion of additional key external parameters (coming soon)
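To give a feel for the fuzzy matching step mentioned in the cleaning process above, here is a minimal sketch using Python's standard `difflib` module. It is not the WPdx cleaning code: the category list, the cutoff, and the function name are assumptions made for this example.

```python
import difflib

# Hypothetical list of standard technology categories.
STANDARD_TECH = ["Hand Pump", "Mechanized Pump", "Tapstand", "Rope and Bucket"]

def clean_category(raw, categories=STANDARD_TECH, cutoff=0.6):
    """Return the closest standard category for a free-text entry, or None."""
    lookup = {c.lower(): c for c in categories}
    matches = difflib.get_close_matches(
        raw.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None

for entry in ["handpump", "Hand pmp", "mechanised pump", "xyz"]:
    print(entry, "->", clean_category(entry))
```

Entries that match nothing above the cutoff come back as `None`, which leaves room for a manual review step rather than forcing a wrong category. The original value stays untouched, in line with the "clean" columns being appended to the record.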

Share your data with WPDx... in 30 minutes or less!

Sharing data with WPDx has never been easier. In fall 2020, WPDx completed a major overhaul of our ingestion engine to streamline the process for data sharing. This blog will take you step-by-step through the upload process. In most cases, this will take less than 30 minutes to complete! If you have questions, please reach out to info@waterpointdata.org.

Before you start, please review our Data Submission Policy to ensure that you have the correct permissions to share the data.

The first step is to review the WPDx data standard and compare with your organization’s dataset. The ingestion notes file can help you document how to map your data to the standard which will save you time later in the process.

To upload data, the dataset must include at minimum: location (latitude and longitude in decimal degrees), presence of water when assessed (functional status), date of data inventory, data source (organization providing the data), and information on the source of the water point, its technology, or both. While these are the minimum requirements, we highly encourage organizations to share as many parameters as possible to provide a more complete entry. These additional parameters, such as install year or management, are utilized in the predict water point status tool.

Accessing the ingestion engine

Once you know which columns from your dataset you want to share, you are ready to start the upload process. Go to http://upload.waterpointdata.org to access the WPDx ingestion engine.

  • Click on “Login to the System.”
  • Please note, the ingestion engine requires a Google account.

After login, you’ll arrive at the ingestion engine dashboard:

Sharing your data file

There are two options for uploading data:

  • Upload a physical file (.xlsx, .xls, .csv) from your computer
  • Provide a web link to an API endpoint, Google Sheet, Dropbox or other online system

To upload a physical file:

Before you upload the file, please rename the file using the following format:

  • Organization Name_Countries included_Month Year of Data included
  • For example, Global Water Challenge_Uganda_Jan2020

Select the “Source Data” tab

  • Select “+ Upload Data File”
  • Click on “Select File”, browse to your organization’s data file and click “Open”
  • “File Upload Successful” message will appear at top of screen

Share data via weblink

To upload from a weblink, you must provide a weblink with permissions. You will enter the weblink on the Data Import Workbench page after first providing some basic information about your dataset.

  • For Akvo Flow, request an API endpoint from your program manager. The API endpoint will be used in the direct URL box at the beginning of a processing task. For more details, please see here.
  • For mWater, create a datagrid formatted per the WPDx standard. This creates a permanent URL. Click on “Download as XLSX” and copy the download link. Use this in the direct URL box at the beginning of a new processing task. For more details, please see here.
  • For Dropbox, copy the download link (not the sharing link) to use in the WPDx ingestion engine. Use this link in the direct URL box at the beginning of a new processing task. Select the appropriate format from the dropdown.
  • For Google Sheets, ensure that the document is shared publicly (select “Anyone on the internet with this link can edit” from the share settings). Enter the URL for the Google Sheet in the direct URL box at the beginning of a new processing task. Be sure to select Google spreadsheet from the format dropdown.
  • For custom data platforms, please contact us to determine how we can best connect.

Start New Processing Task

Select “Processing Tasks” tab

Select “+ New Processing Task”

Task Name and Description

Enter the Task Name in the following format:

OrgName_Country/Region_Month/Year of data

For example, Global Water Challenge_Global_2019

Provide the main purpose for the collected data under Description

Metadata

Complete the metadata prompts to provide a detailed overview of the data within your dataset.

The metadata will be visible on the data page for your dataset within the WPDx data catalog.

Point of Contact

Complete Point of Contact details for dataset.

To protect privacy, one option is to use an organizational level email (e.g., data@name.org) which can be forwarded by your organization to relevant contacts.

Agree to Data Sharing Terms

Check box to agree to Data Sharing Terms

Leave visibility as “Only Visible to Me”

Select: Save & go to Workbench

Data Import Workbench

Select your source file from the dropdown.

Allow data to process (this may take a few minutes). The Direct URL and format boxes will auto-populate.

If there are multiple sheets in your file, make sure the correct one is selected.

Scroll down to continue (the “Data is Processing” message may still appear)

If using a web address, enter directly in Direct URL text box and select the appropriate format option.

For JSON formats, be sure to leave the JSON Path field blank.

Data Structure

If your dataset is formatted to include only the column headers and the data, leave Skip Rows/Columns as “0”

If there are additional rows or columns which should be skipped (e.g., additional headers or title cells), enter the number of rows/columns to skip.

For the sample data shown below, you would enter “2” in Skip Rows. Leave Skip Columns at “0”
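As an illustration of what Skip Rows and Skip Columns do, the sketch below reads a hypothetical CSV export that has two title rows above the column headers. The file contents and the `read_table` helper are made up for this example and are not part of the ingestion engine.

```python
import csv
import io

def read_table(text, skip_rows=0, skip_columns=0):
    """Drop leading rows/columns, then treat the next row as the header."""
    rows = list(csv.reader(io.StringIO(text)))
    rows = [row[skip_columns:] for row in rows[skip_rows:]]
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]

# Hypothetical export with two title rows before the real header.
sample = (
    "Water point inventory,,\n"
    "Exported Jan 2020,,\n"
    "lat,lon,status\n"
    "2.77,32.30,Yes\n"
)
records = read_table(sample, skip_rows=2)
print(records)
```

With `skip_rows=0` the title row would be mistaken for the header, which is the situation the Skip Rows setting avoids.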

Ignored Values

If your dataset includes terms for blank/unknown values which should be ignored (e.g., Unknown, N/A), please enter those terms in the text box.

Use a comma as a separator between terms. Do NOT include any blank spaces between commas and terms.

For example: “unknown,Unknown,N/A,0,null,blank”
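The no-spaces rule is easier to see in code. The sketch below assumes the terms are split on commas and compared exactly, which is why a stray space would stop a term from matching. That behaviour is an assumption about the ingestion engine, and the helper names and the `install_year` column are hypothetical.

```python
def parse_ignored(terms_text):
    """Split the text box contents on commas only; nothing is trimmed."""
    return set(terms_text.split(","))

def blank_ignored(record, ignored):
    """Replace any value that exactly matches an ignored term with a blank."""
    return {key: ("" if value in ignored else value) for key, value in record.items()}

ignored = parse_ignored("unknown,Unknown,N/A,0,null,blank")
record = {"#water_tech": "Unknown", "install_year": "2015"}
print(blank_ignored(record, ignored))

# With a space after the comma, the second term becomes " N/A" and no longer matches.
print("N/A" in parse_ignored("Unknown, N/A"))
```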

Data Mapping: Getting Started

There are two methods to complete the data mapping process:

Primary method..

  • Using the dropdown menu, scroll to select the column header from your dataset which matches the WPDx standard.
  • Some parameters may pre-populate, especially if your dataset is labeled with the WPDx #titles. Verify these selections.
  • Note: you cannot map the same column to two different standard parameters.

Optional method:

If there is a parameter which is not in your dataset, but for which a common value can be applied to all datapoints, select “Constant…” from the dropdown.

  • Examples
  • #source – Data Source –> Constant: Name of Org
  • #country_id – Country –> Constant: “UG” or “GH”
  • #orig_lnk – Public Data Source URL –> Constant: URL

Data Mapping: Required Fields

There are 6 mandatory parameters (the sixth can be met by #water_source, #water_tech, or both):

  • #lat_deg – Latitude
  • #lon_deg – Longitude
  • #status_id – Presence of Water when Assessed
  • #report_date – Date of Data Inventory
  • #source – Organization providing data
  • #water_source – Water Source AND/OR
  • #water_tech – Water Point Technology
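Before starting the mapping, it can help to check your own column list against the mandatory parameters listed above. The sketch below is an illustration only: the mapping dictionary and the column names in it are hypothetical, and the check runs on your side, not inside WPDx.

```python
REQUIRED = ["#lat_deg", "#lon_deg", "#status_id", "#report_date", "#source"]
EITHER = ["#water_source", "#water_tech"]  # at least one of these is needed


def missing_required(mapping):
    """Return the mandatory WPDx parameters that have nothing mapped.

    `mapping` is a hypothetical dict: WPDx parameter -> your column header
    or constant value. None or "" means the parameter is unmapped.
    """
    missing = [param for param in REQUIRED if not mapping.get(param)]
    if not any(mapping.get(param) for param in EITHER):
        missing.append("#water_source or #water_tech")
    return missing


# Hypothetical column names from an example dataset.
my_mapping = {
    "#lat_deg": "Latitude",
    "#lon_deg": "Longitude",
    "#status_id": "Working",
    "#source": "My Organization",
    "#water_tech": "Pump type",
}
print(missing_required(my_mapping))
```

Here the only gap reported is #report_date, because #water_tech alone satisfies the sixth requirement.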

Data Mapping: #lat_deg and #lon_deg

Latitude and longitude must be in decimal degrees in WGS84.

Select the appropriate column header which matches with #lat_deg.

Go to the next dropdown and make the selection to match #lon_deg.
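If your coordinates were recorded in degrees, minutes, and seconds, they need to be converted to decimal degrees before upload. The following is a minimal sketch with hypothetical function names and illustrative coordinates. It assumes the values are already referenced to WGS84 and does not perform any datum conversion.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = abs(degrees) + minutes / 60 + seconds / 3600
    if hemisphere.upper() in ("S", "W"):
        value = -value  # south and west are negative
    return value


def is_valid_lat_lon(lat, lon):
    """Check that a pair falls inside the valid decimal-degree ranges."""
    return -90 <= lat <= 90 and -180 <= lon <= 180


lat = dms_to_decimal(2, 46, 30, "N")   # illustrative values only
lon = dms_to_decimal(32, 18, 0, "E")
print(round(lat, 6), round(lon, 6), is_valid_lat_lon(lat, lon))
```

A failed range check is often a sign that the latitude and longitude columns have been swapped.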

Data Mapping: #status_id

Select the appropriate column header from the dropdown

Default values include Yes/No. “Unknown” values (see Ignored Values) will be converted to a blank cell in the WPDx Global Data Repository.

If your dataset does not include Yes/No, but instead terms such as “Functional/Partial/Non-functional” select “more settings..” and enter those terms.

True Values = terms which indicate the water point IS functional

False Values = terms which indicate the water point is NOT functional

Do not leave any spaces between terms, just a comma (e.g., Yes,functional)
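To preview how your own status terms would sort into the True Values and False Values lists, you can run a quick check like the one below. It is a sketch under stated assumptions: the function names are hypothetical, matching is assumed to be exact and case-sensitive, and it does not reproduce the WPDx system's internal logic.

```python
def parse_terms(text):
    """Split a comma-separated settings entry into a set of terms."""
    return {term for term in text.split(",") if term}


def to_status_id(value, true_values, false_values):
    """Return "Yes", "No", or None for a single status value."""
    if value in parse_terms(true_values):
        return "Yes"
    if value in parse_terms(false_values):
        return "No"
    return None  # the term appears in neither list


TRUE_VALUES = "Yes,Functional"       # entered without spaces after commas
FALSE_VALUES = "No,Non-functional"

for raw in ["Functional", "Non-functional", "Partial"]:
    print(raw, "->", to_status_id(raw, TRUE_VALUES, FALSE_VALUES))
```

In this example “Partial” is left unclassified until you decide which of the two lists it belongs in for your dataset.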

Data Mapping: #report_date

Select the appropriate column header from the dropdown

The system will automatically detect the format of the dates in your dataset

If there are errors indicated, select “more settings…” and choose a specific format. (This should only be an issue in rare circumstances.)
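Date errors usually come from ambiguous or mixed formats. The sketch below uses Python's standard datetime module to show why a specific format sometimes has to be chosen. The format codes are Python's own and are not the options listed in the WPDx dropdown, and the helper names are hypothetical.

```python
from datetime import datetime


def parse_report_date(text, fmt):
    """Parse one date string with an explicit format; raises ValueError on mismatch."""
    return datetime.strptime(text, fmt).date()


def rows_not_matching(values, fmt):
    """Return the values that do not parse with the given format."""
    bad = []
    for value in values:
        try:
            parse_report_date(value, fmt)
        except ValueError:
            bad.append(value)
    return bad


# The same text means different days under different formats.
print(parse_report_date("03/04/2021", "%d/%m/%Y"))  # read as day first
print(parse_report_date("03/04/2021", "%m/%d/%Y"))  # read as month first
print(rows_not_matching(["03/04/2021", "2021-04-03"], "%d/%m/%Y"))
```

Running a check like rows_not_matching on your date column before upload can reveal rows that use a different format from the rest.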

Data Mapping: #source

Provide the name of the organization providing the data.

If your dataset includes data from multiple sources, please map the parameter to the appropriate column header that lists each organization.

Otherwise, the entry for Data Source in the About the Data section will be applied to all uploaded records.

Data Mapping: #water_source & #water_tech

At least one of #water_source or #water_tech must be mapped for the upload to proceed.

Select the appropriate column header/s from the dropdown

If the information is constant for all values, you can instead select “Constant..” and enter the appropriate value in the text box.

Data Mapping: Optional Fields

The “Optional Fields” are not required, but they do help to provide a more robust dataset for understanding the status of the local water sector.

Please map as many of the WPDx parameters as possible.

For any parameters which do not align with your dataset, you can select “No value for this field” (this is the default selection) and go on to the next parameter.

For example, if your dataset does not include any information on payment, leave #pay set to “No value for this field”.

 

Data Mapping: #country_id

Select the ISO two-letter country code from the list of all ISO country codes.

If your dataset includes entries from different countries, this information should be included in your data file. Select the appropriate column header from the dropdown menu.

If your dataset only includes entries from a single country, you can select “Constant..” and enter a value to be applied to all rows.

Data Mapping: #adm1, #adm2, #adm3

#adm1, #adm2, and #adm3 are official administrative division designations

If you have questions, look at GADM.org (see the steps below) or statoids.com to determine the appropriate designations.

GADM.org: Check administrative divisions

1. Go to GADM.org and Select “Maps”

2. Click on country of interest

3. Select “Show sub-divisions”

4. This creates a map and a list of first-level subdivisions

5. Click on one of the first level sub-divisions

6. Click on “Show sub-divisions”

7. This creates a map and list of second level subdivisions

Data Mapping: #activity_id

Select the appropriate column header from the dropdown

If a locally or globally recognized standardized identification number exists (e.g., a physical well ID number or barcode) within your dataset, please use that column

OR

If your organization has a unique id system which would allow water points to be matched within your organization over time, please use that column

 

Data Mapping: #scheme_id

Select appropriate column header from dropdown

Data Mapping: #install_year

Select the appropriate column header from the dropdown.

Note that this field accepts a four-digit year or a full installation date. Only the year will be extracted from full date entries.

Data Mapping: #installer

Select appropriate header from dropdown.

Data Mapping: #rehab_year

Select the appropriate column header from the dropdown.

Note that this field accepts a four-digit year or a full rehabilitation date. Only the year will be extracted from full date entries.
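To preview which year would be taken from a column that mixes plain years and full dates, a small check like the following can be used for both #install_year and #rehab_year. It is a rough sketch with a hypothetical helper name, and the WPDx system's own extraction may behave differently.

```python
import re


def extract_year(value):
    """Return the first standalone four-digit number in the value, or None."""
    match = re.search(r"(?<!\d)(\d{4})(?!\d)", str(value))
    return int(match.group(1)) if match else None


for raw in ["2015", "2015-06-30", "30/06/2015", "n/a"]:
    print(raw, "->", extract_year(raw))
```

Values with no separators between date parts (for example an eight-digit string) are not recognized by this sketch and would need to be reformatted first.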

Data Mapping: #rehabilitator

Select appropriate header from dropdown.

Data Mapping: #management

Select appropriate column header from dropdown.

Select the management classification of the entity that directly manages the water point. Example management types include:

  • Direct Government Operation
  • Private Operator/Delegated Management
  • Community Management
  • School
  • Healthcare Facility
  • Other Institutional Management
  • Other

Data Mapping: #pay

Select appropriate column header from dropdown.

Data Mapping: #status

Select appropriate header from dropdown.

Please note that the system cannot map the same column to two different WPDx parameters. If you would like to use the same column, please duplicate it in your dataset (and change one of the column headers). For example, it may be useful to use a duplicated version of your functionality column for both #status_id and #status.
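Duplicating a column can be done in a spreadsheet, or with a short script if the file is large. The sketch below uses Python's standard csv module. The function name, the column headers, and the sample rows are all hypothetical.

```python
import csv
import io


def duplicate_column(csv_text, source, new_name):
    """Return CSV text with `source` copied into a new column `new_name`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if new_name in reader.fieldnames:
        raise ValueError(f"column {new_name!r} already exists")
    fieldnames = list(reader.fieldnames) + [new_name]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[new_name] = row[source]  # copy the value under the new header
        writer.writerow(row)
    return out.getvalue()


# Made-up sample rows for illustration.
sample = "id,functionality\n1,Functional\n2,Non-functional\n"
print(duplicate_column(sample, "functionality", "functionality_detail"))
```

After this step, one header can be mapped to #status_id and the other to #status.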

Data Mapping: #orig_lnk

If the data is available via a public link, select ‘Constant’ from the dropdown and enter it so that it can be applied to all rows.

If there is no public link, leave as ‘No value for this field’

Data Mapping: #photo_lnk

Select appropriate column header from dropdown.

If there is no public link, leave as ‘No value for this field’

Data Mapping: #fecal_coliform_presence

Select appropriate column header from the dropdown

Default values include Present/Presence and Absent/Absence. If your dataset includes other terms, select ‘more settings…’ and enter the terms into the True Value and False Value text boxes.

Separate terms with a comma but do not include any spaces.

Complete associated metadata questions at the bottom of the page (see Water Quality Metadata section for more information).

 

Data Mapping: #fecal_coliform_value

Select appropriate column header from dropdown

Complete associated metadata questions

Data Mapping: #subjective_quality

Select appropriate column header from dropdown

Complete associated metadata questions

Data Mapping: #notes

Select the appropriate column header from the dropdown, or apply a Constant value if appropriate.

The #notes parameter can be used to enter custom data which the host country government or organization has selected.

For example, some organizations want to track seasonality, additional administrative districts, or some combination.

Multiple parameters can be included by creating a column that includes the parameters of interest, separated by a “;” or “…” delimiter.
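A combined notes column can be built from several existing columns before upload. The sketch below shows one possible way to do this. The function name, the field names, the sample values, and the “name=value” layout are all assumptions made for illustration, and only the “;” delimiter comes from the guidance above.

```python
def build_notes(record, fields, delimiter=";"):
    """Join the non-empty values of the chosen fields into one notes string."""
    parts = []
    for name in fields:
        value = record.get(name)
        if value is None:
            continue  # field missing from this record
        value = str(value).strip()
        if value:
            parts.append(f"{name}={value}")
    return delimiter.join(parts)


# Made-up record for illustration.
record = {"seasonality": "dry season only", "sub_district": "Example Ward", "comments": ""}
print(build_notes(record, ["seasonality", "sub_district", "comments"]))
```

Empty fields are skipped so that the notes column does not fill up with dangling delimiters.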

Water Quality and Notes Metadata

If you mapped the #fecal_coliform_presence, #fecal_coliform_value or #notes columns, please complete the additional metadata question section.

Once mapping is complete

Select “Save” or “Save and Submit for Approval”

Select Save and Submit for Approval when your data has been fully mapped and is ready for upload

The status in the Processing Tasks tab will now show as “Pending”

An administrator will be notified and will complete the uploading process

Once approved, an email will be sent to the uploader’s email address

If the mapping was not successful, you will see an error message indicating which parameter was not mapped and an explanation of why. Once the error has been fixed, you can submit the processing task for approval.

Successful Upload!

Once the data upload has been completed by an administrator, the status in the Processing Task will be marked as “Success”. An auto-generated email will also be sent to the account email address. 

You can view an overview of the dataset in the WPDx data catalog by clicking on the eye icon.

The data catalog dataset page includes:

  • Metadata and contact details
  • Ingestion report – summary statistics of the number of rows uploaded and any errors encountered
  • Link to download source file

Data will be visible on the WPDx data repository within 24 hours.

Need to make changes?

Users can edit their datasets and processing tasks to correct errors or make other additions (e.g., add a new column that was not previously mapped).

To remove data from WPDx, please contact the administrator at info@waterpointdata.org with “Request to remove data from WPDx” in the subject line. Include the name of the source file and the reason for the removal request.

Source Data: Update Contents or Delete

If you realize you have made an error and/or need to edit or amend an existing dataset, go to the Source Data tab, select ‘Update Contents’ and upload a revised file.

Once the file has been updated, go back to the associated Processing Task and check/edit the Processing Task content and data mapping and hit “Save and Submit” at the bottom of the Data Import Workbench page.

Do not use ‘Update Contents’ to initiate a new dataset upload as this will replace any previously shared data. Instead upload a new file and start a new Processing Task.

Editing a Processing Task

If you want to add/edit the metadata for your dataset and/or make changes to the way that the data is mapped to the standard, select “Edit” from the Processing Task tab.

Make any changes and hit “Save and Submit” at the bottom of the Data Import Workbench page.

An admin will be alerted of your update and will review and process the upload.

Questions?

Please contact: info@waterpointdata.org

Check out our Resources and FAQs

Increasing water point data sharing for evidence-based decision-making in Ethiopia: The start of a journey

Guest Blog by Laura Brunson, Ph.D., Deputy Director, Millennium Water Alliance

Ethiopia is home to 112 million people with more than 81 million living in rural areas. According to a 2017 Joint Monitoring Program report, in rural areas of Ethiopia, only 4% have access to safely managed water services, 30% have basic water service, and 26% have only limited water service, with the rest consuming water from unimproved or surface water sources. Many households spend 30 minutes or longer to obtain drinking water daily. Despite Ethiopia’s achievement of the 2015 Millennium Development Goal on water, there are several unaddressed challenges that hinder safe and sustainable water service delivery for millions of rural Ethiopians.

The One WASH National Program (OWNP) of Ethiopia is a flagship program that has enabled the WASH sector to collaborate more broadly and achieve substantial progress since its launch in 2013. The evaluation of the first phase of the OWNP (2013-2017), commissioned by the Ministry of Water, Irrigation and Electricity (MoWIE), revealed a set of bottlenecks encountered by the water supply sector in Ethiopia. These include: the lack of an independent regulatory entity, inadequate involvement of and resources for the private sector including microfinance institutions, the absence of harmonization between water inventory and other sector data, and the absence of an operational Management Information System (MIS) for data.

Challenges with Data Management

A series of workshops conducted through the MWA partners in 2018-2019 in the Amhara Region clarified the type and extent of the challenges that WASH sector partners face because of the gap in water point data availability and utilization. Some of the major problems associated with the lack of up-to-date water point data included: the inability of water actors to make evidence-based decisions on operation and maintenance, an increased non-functionality rate of water points, and difficulties with financial resource allocation. Rural water point data is housed in a multitude of formats and places: woreda (district) governments, the national government, and NGOs all have their own sets of data for particular geographic areas, and these are often outdated.

MWA and its members and partners recognize the urgent need to strengthen the government’s capacity in water supply data management, analysis, and evidence-based decision making. Strengthening the government-led monitoring system is one of the priorities identified for MWA and partners in the on-going Sustainable WASH Program, 2019-2024, funded by the Conrad N. Hilton Foundation. This program operates in the Dera, Farta and North Mecha woredas; MWA members in Ethiopia, via other programs and funders, are working on water in more than 100 woredas.

A Platform for Data Sharing, Access and Analysis

WPDx is a free and open-source platform that serves as a repository for rural water point data and provides decision-making tools to support governments and other stakeholders. Over time the WPDx Working Group developed a short list of standard parameters that are required in any data upload, alongside a much larger list of optional parameters which support a more robust data set. Recent updates to WPDx provide a fast and easy method for sharing data that is collection-tool agnostic (e.g., data collected in ODK, mWater, Akvo, and other tools can all be used). WPDx provides maps and easy ways to search for and access data by organization, location, and other parameters. One of the strengths of WPDx is its ability to compile data from multiple entities or platforms (e.g., government and NGO data from one district) and make all of it available for use. WPDx provides four decision-support tools, three using geospatial analytics and one using machine learning to make status predictions. The three geospatial tools include: assessment of basic water access by district, prioritized locations for implementation of new water points, and prioritized locations where repairs are most impactful. The predictive tool provides insights on the probability that any given water point will be functioning or not.

WPDx can serve as a helpful tool for governments and NGOs as many are seeking to improve their data collection, use and analysis. With quick and easy data uploading, dispersed and fragmented data sets can be combined for easy visualization, with free access to the analysis and decision-making tools. The Government of Sierra Leone has already incorporated WPDx as one of its focus tools and has mandated that all rural water point data collected across the country must be shared into the WPDx platform. The national Ministry of Water Resources has a directive in place to require the use of WPDx decision support tools in all investment decisions for rural water services. You can see more about the role of WPDx in Sierra Leone here.

A New Partnership

In 2020 MWA and the Global Water Challenge (GWC) developed a partnership with the goal to provide support in response to these identified rural water point data challenges in Ethiopia. As a starting point, MWA had a series of discussions with the Water Development Commission (WDC), which sits within MoWIE, about their challenges and whether or not WPDx could be a useful tool to help strengthen rural water service delivery, functionality and informed decision-making in line with the Sustainable Development Goal target 6.1. The WDC expressed interest in using WPDx and getting support from MWA, its members, and other stakeholders to use WPDx in Ethiopia.

Several resulting actions took place:

  • WDC issued a formal letter requesting NGOs to share rural water point data to WPDx
  • WDC issued a formal letter of support to GWC to collaborate to implement WPDx
  • WDC assigned a focal WPDx person to support these efforts.

Building on this expressed interest and formal requests from the Ministry, MWA provided a series of training opportunities for NGOs and government partners to learn more about WPDx. This first set of trainings included:

  1. Purpose and value-add of WPDx in the water sector
  2. How to easily upload data using the new WPDx ingestion engine
  3. Example from Sierra Leone showing use of WPDx by national government
  4. Noting that MoWIE, via WDC, has approved partnership with WPDx and encourages organizations to share their data to WPDx.

A second training was developed to provide information on the decision-making tools. This training included:

  1. Overview of the available decision support tools
  2. How to use the tools
  3. Discussion on how the decision support tools can be useful to regional and woreda governments in prioritizing locations for new water points or rehabilitation and for monitoring basic service delivery levels.

Training sessions were guided by presentations delivered by MWA with support from GWC and then followed by interactive question and reflection sessions by the participants. Participants were encouraged to share data during the trainings to practice data uploading and then return to their home organizations or offices and share larger recent data sets to the WPDx platform.

Select Lessons Learned from the Training Process

  • The introduction of the WPDx initiative to NGO partners and their willingness to engage in trainings and upload data has demonstrated the desire and readiness of WASH stakeholders in Ethiopia for a robust rural water point data platform for planning and decision-making. The importance of monitoring data for the WASH sector has been highlighted in several workshops convened by the government for some years. Nevertheless, an open-source platform like WPDx that can be used by each and every WASH stakeholder has never been planned or practiced in Ethiopia. The reflection from the trained focal persons about its accessibility and ease of application has indicated potential for continued use and ability to add value.
  • The WPDx initiative and the training provided motivated NGO partners to start compiling the most recent data they have and updating existing water point data. On average, most organizations required about a month to obtain necessary permissions to share data, from government or headquarter offices, clean their data, and upload it to WPDx.
  • The reflection and expressions of demand from the Water Development Commission of Ethiopia imply the potential for WPDx to be aligned with the National MIS for water supply.

Within just the three-month period following the trainings, NGOs and government entities uploaded more than 20,000 data points.

While this is great progress, particularly during a time of pandemic and political unrest, this is only the beginning. The MWA, GWC, and WDC partnership continues with next steps including another training series for more organizations and government entities, support for a critical mass of uploads in target districts to support use of the analysis and decision-making tools and then development of a case study, and support to increase the geographic scope in which WPDx data is shared within the country. Stay tuned for future updates on the uptake and use of WPDx by National and local government in Ethiopia.

 

Funding for the work discussed in the post was generously provided by the DT Institute and the Conrad N. Hilton Foundation.

Photos: Credit to Tedla Mulatu Millennium Water Alliance. Photos show government and NGO partners engaged in WPDx training sessions.

 

The Millennium Water Alliance is a permanent global alliance of leading humanitarian and private organizations that convenes opportunities and partnerships, accelerates learning and effective models, and influences the WASH space by leveraging the expertise and reach of its members and partners to scale quality, sustained WASH services. MWA’s 20 members work in more than 90 countries around the world. MWA serves as a hub for major programs in Kenya and Ethiopia.

The Water Point Data Exchange is an initiative of the Global Water Challenge (GWC). GWC is a coalition of leading organizations committed to achieving universal access to safe water, sanitation, and hygiene (WASH) and women’s empowerment. With companies, civil society partners, and governments, GWC accelerates the delivery of safe water and sanitation and supports gender equality through partnerships that catalyze financial support and drive innovation for sustainable solutions.

How to use machine learning to predict water point status

Guest Blog by Lars Heemskerk, Consultant for Akvo

< The water point you selected is probably no longer functional > 

If you’re responsible for providing drinking water to as many people as possible, this is the kind of information you want to have access to – especially when you’re hundreds of miles from the water point in question. Thanks to the support of the Dutch Ministry of Foreign Affairs and the Coca-Cola Foundation (TCCF), Akvo, together with WPDx and DataRobot, was able to conduct a pilot in Sierra Leone using machine learning algorithms to automate decision intelligence.

Improving water services in Sierra Leone

Since 2012, the government of Sierra Leone has been monitoring water points through a large-scale national inventory, as well as small-scale monitoring efforts by NGOs. Data has been collected on the functionality, year of construction, type of pump, type of management, distance to village, etc. to calculate the percentage of the population that has access to drinking water. This data provides an overall insight into the state of WASH infrastructure in the country and, because Sierra Leone is at the forefront of African countries sharing data openly, a lot of this data is available on platforms like WASH data Sierra Leone and WPDx.

Unfortunately, this data is not regularly updated, so the information on these portals is quickly outdated and therefore less reliable. Thanks to various efforts from WPDx, among others, the importance of regular uploading of data has been emphasised in the National Digital Monitoring Approach. The recent signing of a letter by the director of the Water Directorate, which mandates the sharing of water point data by every organisation or government body in Sierra Leone, is an indispensable step in this process.

In addition, Akvo, in collaboration with WPDx and the Ministry of Water Resources, has started to explore how more can be done with the existing data, at local and national level, to generate data-driven insights that can improve decision making. Machine learning is relatively new in the water sector, but can be applied very well to historical data to predict outcomes and uncover patterns not easily spotted by humans.

Setting up the foundation for advanced analytics

Machine Learning is about recognising patterns in data. Using data collected in the past, machine learning techniques can recognise patterns and make predictions for the future. This can be applied to historical water point data, too. 

Based on the available data, and with the help of DataRobot software, we have been able to determine a number of indicators that are related to the metric we want to predict: functionality. By combining functionality with other indicators, such as district, county, management, age, water source, and type, the system can teach itself to predict the probability that a water point will be functional now or in the future. The tool is made available on the Water Point Data Exchange.
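To give a feel for how a model can learn the probability of functionality from indicators, here is a deliberately small sketch. It is not the DataRobot model used in the pilot: it swaps in a plain logistic regression trained by gradient descent, uses only two hypothetical indicators (age and a management flag), and runs on made-up toy records rather than real WPDx data.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, epochs=2000, lr=0.1):
    """Fit logistic regression weights with simple stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            error = p - y  # gradient of the log loss with respect to z
            bias -= lr * error
            weights = [w - lr * error * v for w, v in zip(weights, x)]
    return weights, bias


def predict(weights, bias, x):
    """Return the estimated probability that a water point is functional."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Made-up toy records, NOT real data: [age in decades, 1 if community-managed else 0]
rows = [[0.2, 0], [0.5, 1], [0.8, 0], [1.5, 1], [2.0, 0], [2.5, 1]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = functional when assessed, 0 = not functional

weights, bias = train(rows, labels)
print("3-year-old point:", round(predict(weights, bias, [0.3, 0]), 2))
print("22-year-old point:", round(predict(weights, bias, [2.2, 1]), 2))
```

In this toy data the older points are the non-functional ones, so the fitted model assigns a lower probability of functionality to the older point. A real model would draw on many more records and indicators.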

By using the DataRobot platform, we were able to predict which water points are going to break with an accuracy of 85%. By applying these machine learning models, it’s possible to determine which broken water point, out of thousands, should be fixed first to help the most people. On top of this tool, decision makers can also make use of other geographic information system (GIS) tools that have been developed to analyse water points to determine high-impact locations for rehabilitation and construction and to estimate basic water coverage aligned with the Sustainable Development Goals (SDGs).

Pilot training and support 

When implementing these new advanced analytics techniques, it is just as important to involve and train stakeholders. This is not an easy process because it involves major process changes and the involvement of various governmental and non-governmental organisations. In 2019, the Global Water Challenge held a three-day training session with all district water directorates to discuss the transformation of the WASH sector to improve efficiency through the use of data. Following this session, a meeting was held to brief NGOs on the WPDx approach. Building on this general training, more focused training was provided to district mapping officers and NGOs. The next step was to set up a plan on how to use and implement the decision support tool. At the time of writing, a draft plan has been created and a workshop has been organised to dig deeper into how the decision support tools can contribute to safe water for all in Sierra Leone.

The need for more accurate data

Besides the involvement of NGOs and government bodies, reliable and up-to-date data is crucial for making correct predictions. Since the last national inventory dates back to 2016, it’s important that the water points are systematically monitored. With the letter from the above-mentioned Water Directorate, there will be a boost of more recent data which will certainly have a positive effect.

We also encourage stakeholders to test whether the machine learning predictions correspond to reality. This can be done on a small scale. There are talks with the Ministry of Water Resources and InterAide to carry this out and test whether the outcomes of the tools are correct and usable in the daily life of decision makers. We would like to continue with this in 2021, in order to prove the power of advanced analytics, but above all to provide drinking water to as many Sierra Leoneans as possible.

Celebrating Open Data Day 2021: The Power of Rural Water Point Data to Improve Decisions

WPDx is excited to promote transparent data sharing in the rural water sector through our first Open Data Day celebration!

Bringing Together the Pieces of the Puzzle

Across the WASH sector there is growing recognition that regular monitoring, data collection, and evidence-based decision making can improve water access program outcomes, and many organizations and governments are working diligently to collect data in their areas of operation. However, unless data is openly shared, entities are only able to utilize their own data to make decisions – which is only one piece of the puzzle.

Sharing data through the WPDx platform enables the puzzle pieces to come together to show the entire landscape and provide a more comprehensive understanding of the water sector. This link shows how WPDx works to harmonize data regardless of which organization collected the data or which collection platform was utilized. The harmonized dataset, available on the WPDx Data Repository, also serves as a starting point for robust decision-support analysis.

New Predict Water Point Status Tool… Coming Soon

To demonstrate the power of using open data to improve rural water decisions, we will soon be launching an updated version of our Predict Water Point Status tool. The results from this tool provide insights about which water points may break down in the near future, which can be used to inform decisions around preventative maintenance, increased monitoring and resource allocations. We are working on similar updates to our remaining tools which will launch later in 2021.

Recognition of Leaders in Data Sharing

To mark our first celebration of Open Data Day, we take this opportunity to recognize the entities that have demonstrated their commitment to transparency and accountability by sharing data with the WPDx platform, contributing over 40,000 new water point records from 28 countries in the past year. Special recognition goes to the following organizations: 

Countries with the most water point records uploaded in the last year

  • Ethiopia
  • Sierra Leone

Governments with demonstrated national commitment to collecting and using WPDx data for decisions

  • Ministry of Water Resources, Sierra Leone
  • Water Development Commission in the Ministry of Water, Irrigation and Energy, Ethiopia

Government Agencies that have shared the most data in the past year

  • Ministry of Basic and Secondary School Education of Sierra Leone (in partnership with the Ministry of Water Resources of Sierra Leone)
  • Dera, Farta, and North Mecha Water and Energy Offices (Ethiopia)

Organizations that shared the most data in the past year

  • Community-Led Accelerated WASH program (COWASH)
  • Living Water International

Organizations that shared data from the most countries in the past year

  • Living Water International
  • WaterAid

Organization that demonstrated their commitment with automated updates

  • Ugandan Water Project

 

Thank you to our generous funders and key partners

 

 

 

Entities which have shared data with WPDx in the past year