Integrating Governance Factors into WPdx

Governance is recognized as a key aspect of sustainable rural water services. The USAID Governance Research on Rural Water Systems (GROWS) activity was designed to identify and disseminate innovative governance and private sector-derived models and tools to improve water services and help accelerate the elimination of extreme poverty in sub-Saharan Africa. GROWS was targeted at USAID Democracy, Human Rights and Governance (DRG) Officers at Missions across Africa to support increased cross-sectoral programming, leading to enhanced delivery of rural water services. The final product of GROWS is a comprehensive toolkit that includes key research findings and program design support for USAID.

A key finding from GROWS was that the effective sharing of data at the community scale had the potential to improve the governance of local drinking water systems through greater transparency, accountability, and trust between water users and providers. An extension of this finding is that sharing data from individual community water systems can strengthen district and/or regional governance by providing a comprehensive understanding of what is and is not working at different geographic scales.

The Water Point Data Exchange (WPdx) provides a free and open user-friendly platform for data sharing, access, and use, with the ultimate goal of supporting evidence-based decision-making. Through GROWS, WPdx added specific governance-oriented parameters and features to the WPdx+ dataset and decision-support tools app. This includes the integration of new governance-oriented datasets into our status prediction models. These datasets include:

Additionally, WPdx conducted further data cleaning on the #management parameter to create standardized categories for management, including: community based management, direct government operations, private/delegated management, health care facility management, school management, religious institution management, other institutional management, no management and unknown. Users can filter the results from the WPdx decision support tools by management type by selecting “Filter By Attribute” from the menu, as shown below:

The comprehensive WPdx user guide, which provides additional details about the platform, is available here.

Comparing water point based and household survey based access estimates

Figure shows how DHS regions in Uganda (labeled) compare to WPdx regions (sub-counties from GADM), shown in red

What data should I use? Is the data valid? These are some of the driving questions facing decision makers around the world. Multiple sources of data are available to decision makers on the state of water access and services. There is relatively strong agreement that reliable data for decision making is needed. At the same time, it is not always clear which data sources are both available and appropriate to answer the questions about where and how to invest resources in water services and how to appropriately target the poorest.

With funding from the United States Agency for International Development (USAID) Governance Research on Water Systems (GROWS) activity, a study was commissioned to determine how water point coverage estimates based on publicly available data from the Water Point Data Exchange (WPdx) compare and contrast with the official Joint Monitoring Programme of WHO/UNICEF (JMP) figures. The goal is to provide recommendations about how these different estimates could be used in tandem and to identify their respective strengths and limitations. The study was carried out by Nick Dickinson of WASHNote.

Comparing metrics and triangulating different measured results can help validate conclusions and inform decision-making. This study finds a relatively strong correlation and linear trend between the two estimates in four countries. This suggests that using household surveys and water point inventories together can be useful to decision makers who have only one of the two data sources, or who want to validate the conclusions from one against the other.

The full paper has been submitted for peer review. A pre-print is available here. A link to the final paper will be added here once the paper has been accepted.

Acknowledgements from the Author, Nick Dickinson of WASHNote

This study would not have been possible without the contribution of open data on water points by data providers to WPdx. Members of the Water Point Data Exchange (WPdx) working group reviewed both the proposal and findings of this work. Katy Sill of WPdx first recognized the potential of the work, provided invaluable feedback, and responded quickly with explanations about how the WPdx algorithms work while investigating and delivering improvements to the tools when required to make this comparison possible.

Similarly, the National Statistics Offices (NSOs) and the Demographic and Health Surveys (DHS) Program of the United States Agency for International Development (USAID) made it possible to use household survey data from different countries. I would like to thank the Joint Monitoring Programme of WHO/UNICEF (JMP) team for sharing country, regional and global estimates of progress on drinking water, sanitation and hygiene (WASH) in households as well as the estimates for the sub-indicators required to generate those estimates, for providing clarifications about the JMP methodology, and for taking time to reflect on study findings.

This material is based upon work supported by USAID under award number 7200AA18CA00033.

Water point data and governance in Tanzania

With contributions from Herbert Kashililah, Chair, Tanzania Water and Sanitation Network (TAWASANET)

The United States Agency for International Development (USAID) Governance Research on Water Systems, or GROWS, is a research activity that focuses on rural water services as a lens through which to explore the interconnections between good governance and private sector engagement because of the critical role that water plays in community health and economic development.

A key finding from GROWS was that the effective sharing of data at the community scale had the potential to improve the governance of local drinking water systems through greater transparency, accountability, and trust between water users and providers. An extension of this finding is that sharing data from individual community water systems can strengthen district and/or regional governance by providing a comprehensive understanding of what is and is not working at different geographic scales.

In February 2022, GROWS hosted a governance and water data workshop in Tanzania including participants from government, NGOs and development banks to explore current challenges and opportunities related to collecting, sharing, and using data at scale to improve rural water decision-making. The workshop included an active discussion focused on current data collection and analysis approaches as well as an introduction to the Water Point Data Exchange (WPdx) as a free and open platform which can enable data sharing, access and use.

Background

Water point mapping (WPM) to monitor the status of community water supplies has been part of the Tanzanian water management approach since 2004. Initiated by WaterAid and adopted by other key NGOs in the Tanzanian water sector (SNV, Plan International and Concern Worldwide), a widescale water point mapping exercise was held between 2005 and 2009 and resulted in the mapping of 55 of 132 rural water districts. Subsequently in 2010, the Government of Tanzania adopted WPM as its main tool for monitoring rural water supply and developed an online database, the Water Point Mapping System (WPMS). Between 2011 and 2013, the Ministry of Water mapped all districts in Tanzania mainland and data was uploaded to the WPMS. Since 2013, data has been updated on a district-by-district basis by government water managers.

Following the establishment of the Rural Water Supply and Sanitation Agency (RUWASA) in 2019, a new management information system has been created, the Rural Service Delivery Management Information System (RSDMS), which aims to support performance monitoring for both operation and governance needs of rural water investments. RSDMS is used to monitor and regulate construction and service provision for rural water and sanitation at district, regional and national levels.

Workshop Outcomes

Participants stated that accurate and up-to-date data on water point location, functionality, water quality, water tariffs, number of water users, and the presence of active community-based water supply organizations (CBWSOs) were needed to monitor water point functionality.

Participants identified the following key challenges for implementing effective WPM: 1) collecting information from remote water points; 2) ensuring information is up to date; 3) handling issues with accurately coding unique water points; 4) handling challenges around data quality related to paper-based collection; and 5) holding data providers accountable.

Potential approaches to help overcome some of the identified WPM challenges included: 1) building technical capacity of CBWSOs; 2) engaging water users to provide information on water point functionality; 3) investing in hard-to-reach areas; and 4) advancing technological innovation to update the data collection systems.

Workshop participants were keen to learn more about WPdx and how the platform could be useful in compiling data to create a more up-to-date understanding of the rural water landscape and provide evidence-based insights to inform the decision-making process.

Sierra Leone Data Use Impact Desktop Study

Map image from Tonkolili showing investments made and WPdx recommendations.

A desktop study on the potential impact of data use in Sierra Leone was completed by Global Water Challenge for Akvo in support of the Data to Decisions program. A brief summary of the study is shared below, and the full report can be viewed here.

Objective
A common hypothesis is that using evidence to inform decisions regarding placement and repair of water points will lead to more impactful investments compared with traditional methods which rely heavily on political pressures and assumptions. The objective of this desktop study is to determine how many additional people could have theoretically received water services (defined as access to a functional water point within a 1km radius) if decisions about water point investments used evidence-based decision-support tools rather than traditional approaches in Sierra Leone.
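
The 1 km service definition above can be checked with a great-circle distance. The sketch below is illustrative only: the haversine formula, the 6,371 km mean Earth radius, the record layout, and the function names are assumptions for this example, not the method used in the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def has_access(household, water_points, radius_km=1.0):
    """True if any functional water point lies within radius_km of the household.

    `household` is a (lat, lon) tuple; each water point is a hypothetical dict
    with "lat", "lon" and "functional" keys.
    """
    lat, lon = household
    return any(
        wp["functional"]
        and haversine_km(lat, lon, wp["lat"], wp["lon"]) <= radius_km
        for wp in water_points
    )
```

Counting the households (or gridded population cells) for which `has_access` is true gives a simple estimate of the number of people served under this definition.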

Approach
This study focused on analyzing the number of people reached with water point investments made during 2012 in 12 districts in Sierra Leone in comparison with the number who might have been reached if the investment decisions had been driven by evidence. Twelve districts were included in the analysis: Bombali, Bo, Bonthe, Kailahun, Kambia, Kenema, Koinadugu, Kono, Moyamba, Port Loko, Pujehun, and Tonkolili.

Data from water point investments made prior to 2012 were downloaded from the Water Point Data Exchange (WPdx) database to provide a baseline of the data which could have been used to inform decisions made in 2012. Data from investments made in 2012 were also downloaded for the 12 districts from the WPdx database. The majority of data was provided by the Ministry of Water Resources, with additional contributions from non-governmental organizations (NGOs) working in Sierra Leone.

Using the evaluation and repair priority methods, the number of people reached by water point installations and rehabilitations in the twelve districts was analyzed and compared with the number of people who could have been reached based on recommendations from the (first generation) WPdx decision-support tools. The 2012 Sierra Leone dataset did not clearly differentiate between rehabilitations and new constructions.

Key Findings

From the available data, there were 1,561 water investments made in 2012, which reached a total of 28,556 people. Had WPdx data been available and used for making decisions on water investments in 2012, it would have been possible to reach nearly four times as many people with only about a third of the cost in water point investments. WPdx recommendations included 430 water point rehabilitations reaching 109,043 people. This is equivalent to a reduction in cost per person reached from $54.66 to $3.94.
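
The per-person figures can be reproduced from the counts above. The sketch assumes a flat cost of $1,000 per water point investment. That unit cost is not stated in this summary; it is an assumption chosen because it is consistent with the reported $54.66 and $3.94.

```python
UNIT_COST_USD = 1_000  # assumed flat cost per investment (not stated in the summary)


def cost_per_person(investments, people_reached, unit_cost=UNIT_COST_USD):
    """Total spend divided by the number of people reached."""
    return investments * unit_cost / people_reached


actual = cost_per_person(1_561, 28_556)      # investments actually made in 2012
recommended = cost_per_person(430, 109_043)  # WPdx-recommended rehabilitations
reach_ratio = 109_043 / 28_556               # people reached, recommended vs. actual
spend_ratio = 430 / 1_561                    # total spend, recommended vs. actual

print(f"actual: ${actual:.2f} per person")
print(f"recommended: ${recommended:.2f} per person")
print(f"reach ratio: {reach_ratio:.2f}x, spend ratio: {spend_ratio:.2f}")
```

Under the assumed unit cost, the two per-person values round to the figures reported above, and the reach ratio is a little under four.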

The full study is available here.

Acknowledgements

With appreciation to Angela Cotugno for her assistance with the GIS analysis and Daniel Siegel for his contributions to the development of the first generation of WPdx geospatial decision support tools.

*Please note that the study used the first generation of WPdx decision tools, which have since been updated but use similar algorithms.

Building Capacity and Improving Decisions with YouthMappers

Photo courtesy of Project team/Gulu YouthMappers

With contributions from Stella Nacakwa, M.S. Candidate, West Virginia University & Courtney Clark, YouthMappers.

The YouthMappers Chapter at Gulu University, in partnership with YouthMappers, West Virginia University, the Water Point Data Exchange (WPdx) and the U.S. Agency for International Development (USAID), recently launched the Uganda Water Infrastructure Mapping Project (U-WIMP). The U-WIMP project has two complementary goals:

  • building the technical capacity of Ugandan YouthMappers in both mapping and spatial decision-support analysis, and
  • supporting the Gulu District Water office in making evidence-based decisions through the collection of digitized water data and the application of cutting-edge data analytics.

Water point data collection in Gulu has historically been paper-based, making it challenging to compile, analyze, share, and use the data to inform decisions. The YouthMappers U-WIMP project will provide much-needed data updates as well as pilot an approach to digitize drinking water resource monitoring. Dr. Denis Nono, a Gulu University lecturer specializing in water resources, and advisor to the project, applauded the engagement of YouthMappers students as this will provide a significant resource to the Gulu Water District Office as well as a hands-on learning opportunity.

Data collected through the project will be uploaded and harmonized with existing records on the Water Point Data Exchange (WPdx). WPdx is an online platform for sharing, accessing, and using water point data that currently hosts over 600,000 records from over 50 countries. The WPdx database includes 557 water point records in Gulu District, reported between 2010 and 2022. WPdx hosts a suite of decision-support tools designed to help decision-makers optimize limited resources by prioritizing locations for investments, including preventative maintenance, rehabilitation, and new construction, and by providing updated estimates of drinking water service coverage in rural areas for local communities, host country governments, and watershed authorities. Results from the decision-support tools are visualized on interactive web maps and summary graphs which can be used by a range of audiences. Additionally, the collected data will be used in combination with high-resolution satellite imagery to build models which can automatically detect water points.

Following data collection and analysis, the team will host a series of workshops with government, NGO and academic stakeholders to share the findings and explore how the information can be used to inform decision-making processes and work planning in the district.

The project activities will build the chapter student members’ technical knowledge of the water sector and serve to connect these youth with their communities. The project also emphasizes gender considerations through the YouthMappers Everywhere She Maps campaign which seeks to improve the availability of geographic datasets for women’s economic empowerment. 

This project will also raise global awareness among YouthMappers Chapters of the importance of evidence-based decision making and effectively managing water points and water resource allocation, and it will build on WPdx’s current work to improve decisions which ultimately lead to an increase in sustainable access to water services.  

Project Objectives

  1. Build capacity of the Gulu University YouthMappers chapter through introduction to rural water challenges and practical experience in collecting field data.  
  2. Pilot deployment of a team of YouthMappers students utilizing open-source software to collect up-to-date digitized water point data to provide a better understanding of current water access and create a more updated dataset for analysis through the WPdx platform.
  3. Partner with district, regional, and national government authorities to integrate findings from decision-support tools into budgeting and planning decisions. 
  4. Explore the feasibility of using satellite imagery to identify existing water points, both manually and through AI/ML predictive models to help build a more comprehensive inventory which can then be used to help prioritize investments in water point monitoring, preventative maintenance, and rehabilitation to support sustainable service delivery.

Project Timeline

An initial set of in-person trainings for the Gulu University YouthMappers student chapter was held in December 2021 and January 2022, including introductions to rural water, OpenStreetMap, and WPdx, and a practical session on water point data collection. Virtual trainings have continued throughout early 2022, focused on the application of OpenStreetMap to map and visualize features. Additional details and updates on the project's progress can be found on the OpenStreetMap wiki platform.

Next Steps

A comprehensive data collection effort for two sub-counties in Gulu district is scheduled for early summer 2022. Collected data will be cleaned and uploaded to both OSM and WPdx. Collected data will also be utilized to build a training data set using high-resolution satellite imagery for the development of models which can automatically detect water point locations. Decision-support analyses to estimate current coverage levels, and priority areas for rehabilitation, new construction and preventative maintenance will be conducted through the WPdx platform. The results of the data collection and combined analyses will be shared with government and NGO stakeholders during a collaborative workshop.

 

Celebrating Open Data Day 2022: The Power of Rural Water Point Data for Evidence-Based Decisions

WPdx is excited to continue to promote transparent data sharing and use of open data in the rural water sector through our second annual Open Data Day celebration!

Launch of Decision Support Tools Web App

To celebrate Open Data Day 2022, we are pleased to announce the launch of the new beta WPdx Decision Support Tools web app.

The WPdx Decision Support Tools interactive web app allows users to view and explore available water point data and analytical results from the WPdx+ dataset and suite of decision-support tools.

For more information on the launch of the tools, please visit our related blog post and/or see the links below which provide detailed information on each of the available tools. 

  1. Administrative Region Analysis
  2. Rehabilitation Priority Analysis
  3. New Construction Priority Analysis
  4. Data Staleness Analysis
  5. Functional Status Prediction Analysis (coming soon)

Recognition of Leaders in Data Sharing

Over the past year, over 50,000 new water point records have been uploaded to the WPdx platform from 14 different organizations. We want to take this opportunity to recognize and celebrate the following entities that have demonstrated their commitment to transparency and accountability by sharing data with the WPdx platform.

Interested in sharing data with WPdx? Please see here for more details or contact info@waterpointdata.org with questions.

Direct Contributors to WPdx

  • Inter Aide
  • IRC WASH
  • Uganda Water Project
  • USAID Lowland WASH Activity, implemented by DT Global
  • Village Water
  • Water4
  • Water & Sanitation for the Urban Poor (WSUP)
  • Water For People
  • World Serve International
  • YouthMappers

Contributors via Open Data Portals 

Africa GeoPortal

  • Grid3

Humanitarian Data Exchange (HDX)

  • iMMAP
  • United Nations Office for the Coordination of Humanitarian Affairs
  • REACH Initiative

Transforming Data to Action

Over the past year, WPdx has continued to work with government and NGO partners in Ethiopia, Ghana, Uganda and Sierra Leone.

During 2022 we continue to work to integrate the results from the WPdx decision support tools to strengthen existing decision-making processes.

Introducing WPdx+

The Water Point Data Exchange (WPdx) is pleased to announce the launch of a new analysis-ready dataset, Water Point Data Exchange Plus (WPdx+). WPdx+ is a further enhanced and refined version of the original WPdx dataset, now known as WPdx-Basic.

Through the online data playgrounds, all users can:

  • Sort and filter data
  • Create custom sub-datasets based on location or other parameters of interest
  • Visualize data using charts, graphs and simple maps
  • Download/export data

The WPdx+ dataset is focused on a subset of countries for which WPdx has enough data for the decision support tools to be activated. The tools and dataset enhancements can be made available for any country if a representative dataset can be shared with WPdx.

The WPdx+ dataset is the input for the new suite of decision-support tools which are under development. Please check out the recently released updated Rehabilitation Priority tool. Additional updates to the remaining tools will be released in the coming months.

For more information on how to add a new country to WPdx+, email info@waterpointdata.org with “New Country Interest” in the subject line.

Please see below for a brief summary of the two datasets:

WPdx-Basic

  • All data shared to WPdx is included in the WPdx-Basic Global Data Repository.
  • There are five validation/data-cleaning steps which occur during the ingestion process:
    • Ensure that all records contain the required parameters. Records that do not contain required parameters are not uploaded. A summary of records included in the upload can be found in the WPdx Data Catalog.
    • Check to ensure that points are located within a country boundary per GADM boundaries. Points which do not fall within country boundaries (e.g., in the middle of the ocean) are not uploaded. A summary of records included in the upload can be found in the WPdx Data Catalog.
    • Formatting of entries for consistency. For example, for the Presence of Water When Assessed (#status_id) parameter, entries in the repository will be shown as “Yes” or “No”.
    • Addition of ‘clean’ versions of country name, #adm1, #adm2, #adm3 based on provided GPS coordinates and GADM boundaries. ‘Clean’ values are appended to the record, leaving all original data intact.
    • Addition of “water_source_clean”, “water_tech_clean” and “management_clean” columns. These new columns are created using fuzzy matching to organize entries into consistent categories. For more information on the cleaning process, please see here.

WPdx+

  • Focused dataset on countries for which WPdx has enough data for decision-support tools to be activated.
  • Further data processing steps including:
    • Removal of records identified as having location mismatches (i.e., data provided states that record is from Country X, but GPS location is in Country Y)
    • De-duplication for any records mistakenly uploaded twice (exact matches only).
    • Assignment of a WPdx_id which matches water point records shared by different organizations and on different dates, based on GPS location.
  • Addition of external relevant data sources which are used in the water point status prediction models, including:
    • Distance between water point and nearest road (primary, secondary and tertiary), town and city using OpenStreetMap data.
    • Additional external data coming soon!
  • Tabular access to results from advanced decision support tools:
    • Rehabilitation Priority
      • Which non-functional water point should be prioritized for repair?
      • Population living within 1km of water point
      • Likely current and potential users
      • Crucialness of water point (are there alternate working points nearby?)
      • Pressure on water point (is the point over or under utilized?)
    • Water Point Status Predictions – which water points are at a higher risk of failure? (coming soon)
    • Construction Priority – which locations should be considered for new construction to reach unserved populations? (coming soon)
    • Measure water access by administrative division – how does coverage vary within different sub-national divisions? (coming soon)
  • Inclusion of additional key external parameters (coming soon)
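
The fuzzy-matching step that produces columns such as “management_clean” can be illustrated as follows. This is a minimal sketch using Python's standard difflib module; the matching library, the 0.6 cutoff, the subset of categories, and the function name are all assumptions for illustration, and the actual WPdx cleaning process may differ.

```python
import difflib

# A subset of standardized management categories, used here for illustration.
CATEGORIES = [
    "Community Based Management",
    "Direct Government Operations",
    "Private/Delegated Management",
    "School Management",
    "Unknown",
]


def clean_management(raw, cutoff=0.6):
    """Return the closest standard category, or "Unknown" if nothing is close."""
    if not raw or not raw.strip():
        return "Unknown"
    lookup = {category.lower(): category for category in CATEGORIES}
    matches = difflib.get_close_matches(
        raw.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else "Unknown"


# The original value is kept; the cleaned value is added alongside it.
record = {"management": "community-based managment"}
record["management_clean"] = clean_management(record["management"])
print(record)
```

Appending the cleaned value rather than overwriting the original mirrors the approach described above, where ‘clean’ values leave all original data intact.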

Share your data with WPDx... in 30 minutes or less!

Sharing data with WPDx has never been easier. In fall 2020, WPDx completed a major overhaul of our ingestion engine to streamline the process for data sharing. This blog will take you step-by-step through the upload process. In most cases, this will take less than 30 minutes to complete! If you have questions, please reach out to info@waterpointdata.org.

Before you start, please review our Data Submission Policy to ensure that you have the correct permissions to share the data.

The first step is to review the WPDx data standard and compare with your organization’s dataset. The ingestion notes file can help you document how to map your data to the standard which will save you time later in the process.

To upload data, the dataset must include, at a minimum, location (latitude and longitude in decimal degrees), presence of water when assessed (functional status), date of data inventory, data source (organization providing the data), and information on the source and/or the technology of the water point. While these are the minimum requirements, we highly encourage organizations to share as many parameters as possible to provide a more complete entry. Additional parameters, such as install year or management, are used in the predict water point status tool.

Accessing the ingestion engine

Once you know which columns from your dataset you want to share, you are ready to start the upload process. Go to http://upload.waterpointdata.org to access the WPDx ingestion engine.

  • Click on “Login to the System.”
  • Please note, the ingestion engine requires a Google account.

After login, you’ll arrive at the ingestion engine dashboard:

Sharing your data file

There are two options for uploading data:

  • Upload a physical file (.xlsx, .xls, .csv) from your computer
  • Provide a web link to an API endpoint, Google Sheet, Dropbox or other online system

To upload a physical file:

Before you upload the file, please rename the file using the following format:

  • Organization Name_Countries included_Month Year of Data included
  • For example, Global Water Challenge_Uganda_Jan2020

Select the “Source Data” tab

  • Select “+ Upload Data File”
  • Click on “Select File”, browse to your organization’s data file and click “Open”
  • “File Upload Successful” message will appear at top of screen

Share data via weblink

To upload from a weblink, you must provide a weblink with the appropriate access permissions. You will enter the weblink on the Data Import Workbench page after first providing some basic information about your dataset.

  • For Akvo Flow, request an API endpoint from your program manager. The API endpoint will be used in the direct URL box at the beginning of a processing task. For more details, please see here.
  • For mWater, create a datagrid formatted per the WPDx standard. This creates a permanent URL. Click on “Download as XLSX” and copy the download link. Use this in the direct URL box at the beginning of a new processing task. For more details, please see here.
  • For Dropbox, copy the download link (not the sharing link) to use in the WPDx ingestion engine. Use this link in the direct URL box at the beginning of a new processing task. Select the appropriate format from the dropdown.
  • For Google Sheets, ensure that the document is shared publicly (select “Anyone on the internet with this link can edit” from the share settings). Enter the URL for the Google Sheet in the direct URL box at the beginning of a new processing task. Be sure to select Google spreadsheet from the format dropdown.
  • For custom data platforms, please contact us to determine how we can best connect.

Start New Processing Task

Select “Processing Tasks” tab

Select “+ New Processing Task”

Task Name and Description

Enter the Task Name in the following format:

OrgName_Country/Region_Month/Year of data

For example, Global Water Challenge_Global_2019

Provide the main purpose for the collected data under Description

Metadata

Complete the metadata prompts to provide a detailed overview of the data within your dataset.

The metadata will be visible on the data page for your dataset within the WPDx data catalog.

Point of Contact

Complete Point of Contact details for dataset.

To protect privacy, one option is to use an organizational level email (e.g., data@name.org) which can be forwarded by your organization to relevant contacts.

Agree to Data Sharing Terms

Check box to agree to Data Sharing Terms

Leave visibility as “Only Visible to Me”

Select: Save & go to Workbench

Data Import Workbench

Select your source file from the dropdown.

Allow data to process (this may take a few minutes). The Direct URL and format boxes will auto-populate.

If there are multiple sheets in your file, make sure the correct one is selected.

Scroll down to continue (the “Data is Processing” message may still appear)

If using a web address, enter directly in Direct URL text box and select the appropriate format option.

For JSON formats, be sure to leave the JSON Path field blank.

Data Structure

If your dataset is formatted to include only the column headers and the data, leave Skip Rows/Columns as “0”

If there are additional rows or columns which should be skipped (e.g., additional headers or title cells), enter the number of rows/columns to skip.

For the sample data shown below, you would enter “2” in Skip Rows. Leave Skip Columns at “0”

Ignored Values

If your dataset includes terms for blank/unknown values which should be ignored (e.g., Unknown, N/A), please enter those terms in the text box.

Use a comma as a separator between terms. Do NOT include any blank spaces between commas and terms.

For example: “unknown,Unknown,N/A,0,null,blank”
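
A small helper can assemble the ignored-values entry without stray spaces. The function name is hypothetical; only the comma-separated, no-space format comes from the instructions above.

```python
def ignored_values_entry(terms):
    """Join terms with commas, trimming surrounding spaces and dropping
    empty strings and exact duplicates."""
    cleaned = []
    for term in terms:
        term = term.strip()
        if term and term not in cleaned:
            cleaned.append(term)
    return ",".join(cleaned)


entry = ignored_values_entry(["unknown", " Unknown", "N/A ", "0", "null", "blank"])
print(entry)
```

The result can be pasted directly into the Ignored Values text box. Terms that differ only in capitalization are kept as separate entries, as in the example above.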

Data Mapping: Getting Started

There are two methods to complete the data mapping process:

Primary method:

  • Using the dropdown menu, scroll to select the column header from your dataset which matches the WPDx standard.
  • Some parameters may pre-populate, especially if your dataset is labeled with the WPDx #titles. Verify these selections.
  • Note: you cannot map the same column to two different standard parameters.

Optional method:

If there is a parameter which is not in your dataset, but for which a common value can be applied to all datapoints, select “Constant…” from the dropdown.

  • Examples
  • #source – Data Source –> Constant: Name of Org
  • #country_id – Country –> Constant: “UG” or “GH”
  • #orig_lnk – Public Data Source URL –> Constant: URL

Data Mapping: Required Fields

There are six mandatory parameters; the sixth can be satisfied by #water_source, #water_tech, or both:

  • #lat_deg – Latitude
  • #lon_deg – Longitude
  • #status_id – Presence of Water when Assessed
  • #report_date – Date of Data Inventory
  • #source – Organization providing data
  • #water_source – Water Source AND/OR
  • #water_tech – Water Point Technology
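Before submitting, it can help to check your own column mapping against this list. The sketch below is a hypothetical pre-upload check, not part of WPDx; the mapping dictionary and column names are made up for illustration.

```python
REQUIRED = ["#lat_deg", "#lon_deg", "#status_id", "#report_date", "#source"]
EITHER = ["#water_source", "#water_tech"]  # at least one of these must be mapped

def missing_required(mapping):
    """List mandatory parameters that have no column (or constant) mapped."""
    missing = [param for param in REQUIRED if not mapping.get(param)]
    if not any(mapping.get(param) for param in EITHER):
        missing.append(" or ".join(EITHER))
    return missing

# Hypothetical mapping from WPDx parameters to column headers in a dataset.
mapping = {
    "#lat_deg": "Latitude",
    "#lon_deg": "Longitude",
    "#status_id": "Functional",
    "#report_date": "Survey date",
    "#source": "Organisation",
    "#water_tech": "Pump type",
}
problems = missing_required(mapping)
```

An empty list means every mandatory parameter has been mapped.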

Data Mapping: #lat_deg and #lon_deg

Latitude and longitude must be in decimal degrees in WGS84.

Select the appropriate column header which matches with #lat_deg.

Go to the next dropdown and make the selection to match #lon_deg.
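A quick way to spot coordinates that are not in decimal degrees is to check that each value parses as a number and falls within the valid latitude and longitude ranges. The helper below is a minimal sketch for checking your own file before upload; the function name is hypothetical, and it does not verify the datum (WGS84).

```python
def valid_decimal_degrees(lat, lon):
    """True if both values are numeric and within latitude/longitude bounds."""
    try:
        lat, lon = float(lat), float(lon)
    except (TypeError, ValueError):
        # Degrees-minutes-seconds strings such as 8°28'N end up here.
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0
```

For example, `valid_decimal_degrees("8.48", "-13.23")` passes, while a degrees-minutes-seconds string or a latitude of 95 does not.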

Data Mapping: #status_id

Select the appropriate column header from the dropdown

Default values include Yes/No. “Unknown” values (see Ignored Values above) will be converted to a blank cell in the WPDx Global Data Repository.

If your dataset does not include Yes/No, but instead uses terms such as “Functional/Partial/Non-functional”, select “more settings…” and enter those terms.

True Values = terms which indicate the water point IS functional

False Values = terms which indicate the water point is NOT functional

Do not leave any spaces between terms, just a comma (e.g., Yes,functional).
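The True Values and False Values settings can be thought of as two lookup sets. The sketch below illustrates the idea under the assumption of exact term matching; the function names are hypothetical, and whether a term like “Partial” counts as functional is the uploader's decision.

```python
def make_status_mapper(true_values, false_values):
    """Build a function that converts dataset terms to Yes, No, or blank."""
    true_set = set(true_values.split(","))
    false_set = set(false_values.split(","))

    def to_status(value):
        if value in true_set:
            return "Yes"
        if value in false_set:
            return "No"
        return ""  # unrecognised terms are left blank in this sketch

    return to_status

# Here "Partial" is treated as functional; that choice is illustrative only.
to_status = make_status_mapper("Yes,Functional,Partial", "No,Non-functional")
```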

Data Mapping: #report_date

Select the appropriate column header from the dropdown

The system will automatically detect the format of the dates in your dataset

If there are errors indicated, select “more settings…” and choose a specific format. (This should only be an issue in rare circumstances.)

Data Mapping: #source

Provide the name of the organization providing the data.

If your dataset includes data from multiple sources, please map the parameter to the appropriate column header that lists each organization.

Otherwise, the entry for Data Source in the About the Data section will be applied to all uploaded records.

Data Mapping: #water_source & #water_tech

At least one of #water_source or #water_tech must be mapped for the upload to proceed.

Select the appropriate column header(s) from the dropdown.

If the information is constant for all values, you can instead select “Constant…” and enter the appropriate value in the text box.

Data Mapping: Optional Fields

The “Optional Fields” are not required, but they do help to provide a more robust dataset for understanding the status of the local water sector.

Please map as many of the WPDx parameters as possible.

For any parameters which do not align with your dataset, you can select “No value for this field” (this is the default selection) and go on to the next parameter.

For example, if your dataset does not include any information on payment, leave #pay set to “No value for this field”.

 

Data Mapping: #country_id

Use the ISO two-letter country code for each entry.

If your dataset includes entries from different countries, this information should be included in your data file. Select the appropriate column header from the dropdown menu.

If your dataset only includes entries from a single country, you can select “Constant…” and enter a value to be applied to all rows.

Data Mapping: #adm1, #adm2, #adm3

#adm1, #adm2, and #adm3 are official administrative division designations (first, second, and third level).

If you have questions, look at GADM.org (see tutorial on next slides) or statoids.com to determine the appropriate designations.

GADM.org: Check administrative divisions

1. Go to GADM.org and select “Maps”

2. Click on country of interest

3. Select “Show sub-divisions”

4. This creates a map and a list of first-level subdivisions

5. Click on one of the first-level subdivisions

6. Click on “Show sub-divisions”

7. This creates a map and a list of second-level subdivisions

Data Mapping: #activity_id

Select the appropriate column header from the dropdown

If a locally or globally recognized standardized identification number exists (e.g., a physical well ID number or barcode) within your dataset, please use that column.

OR

If your organization has a unique ID system which would allow water points to be matched within your organization over time, please use that column.

 

Data Mapping: #scheme_id

Select appropriate column header from dropdown

Data Mapping: #install_year

Select the appropriate column header from the dropdown.

Note that this field accepts a four-digit year or a full installation date. Only the year will be extracted from full date entries.
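To preview what will be stored, the year extraction can be approximated by pulling the first standalone four-digit number out of each entry. This is only a sketch of the behaviour described above, using a hypothetical helper; the actual parsing rules of the ingestion engine may differ.

```python
import re

def extract_year(value):
    """Return the first standalone four-digit number in the entry, or None."""
    match = re.search(r"(?<!\d)(\d{4})(?!\d)", str(value))
    return int(match.group(1)) if match else None
```

Entries such as `2012`, `15/03/2009`, and `2014-06-01` yield 2012, 2009, and 2014, while a two-digit year such as `03/15/09` yields nothing in this sketch.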

Data Mapping: #installer

Select appropriate header from dropdown.

Data Mapping: #rehab_year

Select the appropriate column header from the dropdown.

Note that this field accepts a four-digit year or a full rehabilitation date. Only the year will be extracted from full date entries.

Data Mapping: #rehabilitator

Select appropriate header from dropdown.

Data Mapping: #management

Select appropriate column header from dropdown.

Select the management classification of the entity that directly manages the water point. Example management types include:

  • Direct Government Operation
  • Private Operator/Delegated Management
  • Community Management
  • School
  • Healthcare Facility
  • Other Institutional Management
  • Other

Data Mapping: #pay

Select appropriate column header from dropdown.

Data Mapping: #status

Select appropriate header from dropdown.

Please note that the system cannot map the same column to two different WPDx parameters. If you would like to use the same column, please duplicate it in your dataset (and change one of the column headers). For example, it may be useful to use a duplicated version of your functionality column for both #status_id and #status.

Data Mapping: #orig_lnk

If the data is available via a public link, select ‘Constant’ from the dropdown and enter it so that it can be applied to all rows.

If there is no public link, leave as ‘No value for this field’.

Data Mapping: #photo_lnk

Select appropriate column header from dropdown.

If there is no public link, leave as ‘No value for this field’.

Data Mapping: #fecal_coliform_presence

Select appropriate column header from the dropdown

Default values include Present/Presence and Absent/Absence. If your dataset includes other terms, select ‘more settings…’ and enter the terms into the True Value and False Value text boxes.

Separate terms with a comma but do not include any spaces.

Complete associated metadata questions at the bottom of the page (see Water Quality Metadata section for more information).

 

Data Mapping: #fecal_coliform_value

Select appropriate column header from dropdown

Complete associated metadata questions

Data Mapping: #subjective_quality

Select appropriate column header from dropdown

Complete associated metadata questions

Data Mapping: #notes

Select the appropriate column header from the dropdown, or apply a Constant value if appropriate.

The #notes parameter can be used to enter custom data which the host country government or organization has selected.

For example, some organizations want to track seasonality, additional administrative districts, or some combination.

Multiple parameters can be included by creating a column that includes the parameters of interest, separated by a “;” or “…” delimiter.
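One way to prepare such a combined column before upload is sketched below. The column names, the `name: value` layout, and the helper `build_notes` are all illustrative assumptions; WPDx only asks that the parameters be separated by a delimiter.

```python
def build_notes(row, columns, delimiter=";"):
    """Join the non-empty values of the chosen columns into one notes string."""
    parts = [f"{col}: {row[col]}" for col in columns if row.get(col)]
    return f"{delimiter} ".join(parts)

# Hypothetical record with two custom parameters and one empty field.
row = {"seasonality": "Dry season only", "sub_district": "Ward 3", "depth": ""}
notes = build_notes(row, ["seasonality", "sub_district", "depth"])
```

The resulting string can be saved as a new column in your file and mapped to #notes.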

Water Quality and Notes Metadata

If you mapped the #fecal_coliform_presence, #fecal_coliform_value or #notes columns, please complete the additional metadata question section.

Once mapping is complete

Select “Save” or “Save and Submit for Approval”

Select Save and Submit for Approval when your data has been fully mapped and is ready for upload

The status in the Processing Tasks tab will now show as “Pending”

An administrator will be notified and will complete the uploading process

Once approved, an email will be sent to the uploader’s email address

If the mapping was not successful, you will see an error message indicating which parameter was not mapped and an explanation of why. Once the error has been fixed, you can submit the processing task for approval.

Successful Upload!

Once the data upload has been completed by an administrator, the status in the Processing Task will be marked as “Success”. An auto-generated email will also be sent to the account email address. 

You can view an overview of the dataset in the WPDx data catalog by clicking on the eye icon.

The data catalog dataset page includes:

  • Metadata and contact details
  • Ingestion report – summary statistics of the number of rows uploaded and any errors encountered
  • Link to download source file

Data will be visible on the WPDx data repository within 24 hours.

Need to make changes?

Users can edit their datasets and processing tasks to correct errors or make other additions (e.g., add a new column that was not previously mapped).

To remove data from WPDx, please contact the administrator at info@waterpointdata.org with “Request to remove data from WPDx” in the subject line. Include the name of the source file and the reason for the removal request.

Source Data: Update Contents or Delete

If you realize you have made an error and/or need to edit or amend an existing dataset, go to the Source Data tab, select ‘Update Contents’ and upload a revised file.

Once the file has been updated, go back to the associated Processing Task and check/edit the Processing Task content and data mapping and hit “Save and Submit” at the bottom of the Data Import Workbench page.

Do not use ‘Update Contents’ to initiate a new dataset upload as this will replace any previously shared data. Instead upload a new file and start a new Processing Task.

Editing a Processing Task

If you want to add/edit the metadata for your dataset and/or make changes to the way that the data is mapped to the standard, select “Edit” from the Processing Task tab.

Make any changes and hit “Save and Submit” at the bottom of the Data Import Workbench page.

An admin will be alerted of your update and will review and process the upload.

Questions?

Please contact: info@waterpointdata.org

Check out our Resources and FAQs

Increasing water point data sharing for evidence-based decision-making in Ethiopia: The start of a journey

Guest Blog by Laura Brunson, Ph.D., Deputy Director, Millennium Water Alliance

Ethiopia is home to 112 million people with more than 81 million living in rural areas. According to a 2017 Joint Monitoring Program report, in rural areas of Ethiopia, only 4% have access to safely managed water services, 30% have basic water service, and 26% have only limited water service, with the rest consuming water from unimproved or surface water sources. Many households spend 30 minutes or longer to obtain drinking water daily. Despite Ethiopia’s achievement of the 2015 Millennium Development Goal on water, there are several unaddressed challenges that hinder safe and sustainable water service delivery for millions of rural Ethiopians.

The One WASH National Program (OWNP) of Ethiopia is a flagship program that has enabled the WASH sector to collaborate more broadly and achieve substantial progress since its launch in 2013. The evaluation of the first phase of the OWNP (2013-2017), commissioned by the Ministry of Water, Irrigation and Electricity (MoWIE), revealed a set of bottlenecks encountered by the water supply sector in Ethiopia. These include: lack of an independent regulatory entity, inadequate involvement of and resources for the private sector including microfinance institutions, absence of harmonization between water inventory and other sector data, and the absence of an operational Management Information System (MIS) for data.

Challenges with Data Management

A series of workshops conducted through the MWA partners in 2018-2019 in the Amhara Region, prompted by the gap in water point data availability and utilization, clarified the type and extent of challenges this gap creates for WASH sector partners. Some of the major problems associated with the lack of up-to-date water point data included: the inability of water actors to make evidence-based decisions on operation and maintenance, an increased non-functionality rate of water points, and difficulties with financial resource allocation. Rural water point data is housed in a multitude of formats and places: woreda (district) governments, national government, and NGOs all have their own sets of data for particular geographic areas, and these are often outdated.

MWA and its members and partners recognize the urgent need to strengthen the government’s capacity in water supply data management, analysis, and evidence-based decision making. Strengthening the government-led monitoring system is one of the priorities identified for MWA and partners in the on-going Sustainable WASH Program, 2019-2024, funded by the Conrad N. Hilton Foundation. This program operates in the Dera, Farta and North Mecha woredas; MWA members in Ethiopia, via other programs and funders, are working on water in more than 100 woredas.

A Platform for Data Sharing, Access and Analysis

WPDx is a free and open-source platform that serves as a repository for rural water point data and provides decision-making tools to support governments and other stakeholders. Over time the WPDx Working Group developed a short list of standard parameters that are required in any data upload, alongside a much larger list of optional parameters which support a more robust data set. Recent updates to WPDx have resulted in a fast and easy method for sharing data that is collection-tool agnostic (e.g., data collected in ODK, mWater, Akvo and other forms can all be used). WPDx provides maps and easy ways to search for and access data by organization, location, and other parameters. One of the strengths of WPDx is its ability to compile data from multiple entities or platforms (e.g., government and NGO data from one district) and make all of it available for use. WPDx provides four decision-support tools, three using geospatial analytics and one using machine learning to make status predictions. The three geospatial tools include: assessment of basic water access by district, prioritized locations for implementation of new water points, and prioritized locations where repairs are most impactful. The predictive tool provides insights on the probability that any given water point will be functioning or not.

WPDx can serve as a helpful tool for governments and NGOs, as many are seeking to improve their data collection, use and analysis. With quick and easy data uploading, dispersed and fragmented data sets can be combined for easy visualization, with free access to the analysis and decision-making tools. The Government of Sierra Leone has already incorporated WPDx as one of its focus tools and has mandated that all rural water point data collected across the country must be shared into the WPDx platform. The national Ministry of Water Resources has a national directive in place to require the use of WPDx decision support tools in all investment decisions for rural water services. You can see more about the role of WPDx in Sierra Leone here.

A New Partnership

In 2020 MWA and the Global Water Challenge (GWC) developed a partnership with the goal to provide support in response to these identified rural water point data challenges in Ethiopia. As a starting point, MWA had a series of discussions with the Water Development Commission (WDC), which sits within MoWIE, about their challenges and whether or not WPDx could be a useful tool to help strengthen rural water service delivery, functionality and informed decision-making in line with the Sustainable Development Goal target 6.1. The WDC expressed interest in using WPDx and getting support from MWA, its members, and other stakeholders to use WPDx in Ethiopia.

Several resulting actions took place:

  • WDC issued a formal letter requesting NGOs to share rural water point data to WPDx
  • WDC issued a formal letter of support to GWC to collaborate to implement WPDx
  • WDC assigned a focal WPDx person to support these efforts.

Building on this expressed interest and formal requests from the Ministry, MWA provided a series of training opportunities for NGOs and government partners to learn more about WPDx. This first set of trainings included:

  1. Purpose and value-add of WPDx in the water sector
  2. How to easily upload data using the new WPDx ingestion engine
  3. Example from Sierra Leone showing use of WPDx by national government
  4. Noting that MoWIE, via WDC, has approved partnership with WPDx and encourages organizations to share their data to WPDx.

A second training was developed to provide information on the decision-making tools. This training included:

  1. Overview of the available decision support tools
  2. How to use the tools
  3. Discussion on how the decision support tools can be useful to regional and woreda governments in prioritizing locations for new water points or rehabilitation and for monitoring basic service delivery levels.

Training sessions were guided by presentations delivered by MWA with support from GWC and then followed by interactive question and reflection sessions by the participants. Participants were encouraged to share data during the trainings to practice data uploading and then return to their home organizations or offices and share larger recent data sets to the WPDx platform.

Select Lessons Learned from the Training Process

  • The introduction of the WPDx initiative to NGO partners and their willingness to engage in trainings and upload data has demonstrated the desire and readiness of WASH stakeholders in Ethiopia for a robust rural water point data platform for planning and decision-making. The importance of monitoring data for the WASH sector has been highlighted in several workshops convened by the government for some years. Nevertheless, an open-source platform like WPDx that can be used by each and every WASH stakeholder has never been planned or practiced in Ethiopia. The reflection from the trained focal persons about its accessibility and ease of application has indicated potential for continued use and ability to add value.
  • The WPDx initiative and the training provided motivated NGO partners to start compiling the most recent data they have and updating existing water point data. On average, most organizations required about a month to obtain the necessary permissions to share data from government or headquarters offices, clean their data, and upload it to WPDx.
  • The reflections and expressions of demand from the Water Development Commission of Ethiopia imply the potential for WPDx to be aligned with the National MIS for water supply.

Within just three months of the trainings, NGOs and government entities uploaded more than 20,000 data points.

While this is great progress, particularly during a time of pandemic and political unrest, this is only the beginning. The MWA, GWC, and WDC partnership continues with next steps including another training series for more organizations and government entities, support for a critical mass of uploads in target districts to support use of the analysis and decision-making tools and then development of a case study, and support to increase the geographic scope in which WPDx data is shared within the country. Stay tuned for future updates on the uptake and use of WPDx by National and local government in Ethiopia.

 

Funding for the work discussed in the post was generously provided by the DT Institute and the Conrad N. Hilton Foundation.

Photos: Credit to Tedla Mulatu, Millennium Water Alliance. Photos show government and NGO partners engaged in WPDx training sessions.

 

The Millennium Water Alliance is a permanent global alliance of leading humanitarian and private organizations that convenes opportunities and partnerships, accelerates learning and effective models, and influences the WASH space by leveraging the expertise and reach of its members and partners to scale quality, sustained WASH services. MWA’s 20 members work in more than 90 countries around the world. MWA serves as a hub for major programs in Kenya and Ethiopia.

The Water Point Data Exchange is an initiative of the Global Water Challenge (GWC). GWC is a coalition of leading organizations committed to achieving universal access to safe water, sanitation, and hygiene (WASH) and women’s empowerment. With companies, civil society partners, and governments, GWC accelerates the delivery of safe water and sanitation and supports gender equality through partnerships that catalyze financial support and drive innovation for sustainable solutions.

How to use machine learning to predict water point status

Guest Blog by Lars Heemskerk, Consultant for Akvo

< The water point you selected is probably no longer functional > 

If you’re responsible for providing drinking water to as many people as possible, this is the kind of information you want to have access to – especially when you’re hundreds of miles from the water point in question. Thanks to the support of the Dutch Ministry of Foreign Affairs and the Coca-Cola Foundation (TCCF), Akvo, together with WPDx and DataRobot, was able to conduct a pilot in Sierra Leone with machine learning algorithms to automate decision intelligence.

Improving water services in Sierra Leone

Since 2012, the government of Sierra Leone has been monitoring water points through a large-scale national inventory, as well as small-scale monitoring efforts by NGOs. Data has been collected on the functionality, year of construction, type of pump, type of management, distance to village, etc. to calculate the percentage of the population that has access to drinking water. This data provides a global insight into the state of WASH infrastructure in the country and, because Sierra Leone is at the forefront of African countries sharing data openly, a lot of this data is available on platforms like WASH data Sierra Leone and WPDx.

Unfortunately, this data is not regularly enriched, so the information on these portals is quickly outdated and therefore less reliable. Thanks to various efforts from WPDx, among others, the importance of regular uploading of data has been emphasised in the National Digital Monitoring Approach. The recent signing of a letter by the director of the Water Directorate, which states the mandatory sharing of water point data by every organisation or government body in Sierra Leone, is an indispensable step in this process.

In addition, Akvo, in collaboration with WPDx and the Ministry of Water Resources, has started to explore how more can be done with the existing data, at local and national level, to generate data-driven insights that can improve decision making. Machine learning is relatively new in the water sector, but can be applied very well to historical data to predict outcomes and uncover patterns not easily spotted by humans.

Setting up the foundation for advanced analytics

Machine Learning is about recognising patterns in data. Using data collected in the past, machine learning techniques can recognise patterns and make predictions for the future. This can be applied to historical water point data, too. 

Based on the available data, and with the help of DataRobot software, we have been able to determine a number of indicators that are related to the predictable metric – functionality. By combining functionality with other indicators, such as district, county, management, age, water source, and type, the system can teach itself to predict the probability that a water point will be functional now or in the future. The tool is made available on the Water Point Data Exchange.
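To give a feel for the idea of learning from historical records, the sketch below estimates the share of functional water points for each combination of indicators. This is a deliberately simple frequency estimate, not the DataRobot models used in the pilot, and the records, field names, and function name are invented for illustration.

```python
from collections import defaultdict

def fit_rates(records, keys):
    """Estimate the functional share for each combination of indicator values."""
    counts = defaultdict(lambda: [0, 0])  # group -> [functional, total]
    for record in records:
        group = tuple(record[key] for key in keys)
        counts[group][0] += 1 if record["functional"] else 0
        counts[group][1] += 1
    return {group: ok / total for group, (ok, total) in counts.items()}

# Hypothetical historical records.
history = [
    {"management": "community", "type": "hand pump", "functional": True},
    {"management": "community", "type": "hand pump", "functional": False},
    {"management": "private", "type": "hand pump", "functional": True},
    {"management": "private", "type": "hand pump", "functional": True},
]
rates = fit_rates(history, ["management", "type"])
```

A real model combines many more indicators and generalises to combinations it has not seen, which a plain lookup like this cannot do.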

By using the DataRobot platform, we were able to predict which water points are going to break with an accuracy of 85%. By applying these machine learning models, it’s possible to determine which broken water point, out of thousands, should be fixed first to help the most people. On top of this tool, decision makers can also make use of other geographic information system (GIS) tools that have been developed to analyse water points to determine high impact locations for rehabilitation and construction, and to estimate basic water coverage aligned with the Sustainable Development Goals (SDGs).

Pilot training and support 

When implementing these new advanced analytics techniques, it is just as important to involve and train stakeholders. This is not an easy process because it involves major process changes and the involvement of various governmental and non-governmental organisations. In 2019, the Global Water Challenge held a three-day training session with all district water directorates to discuss the transformation of the WASH sector to improve efficiency through the use of data. Following this session, a meeting was held to brief NGOs on the WPDx approach. Building on this general training, more focused training was provided to district mapping officers and NGOs. The next step was to set up a plan on how to use and implement the decision support tool. At the time of writing this blog, a draft plan has been created and a workshop has been organised to dig deeper into how the decision support tools can contribute to safe water for all in Sierra Leone.

The need for more accurate data

Besides the involvement of NGOs and government bodies, reliable and up-to-date data is crucial for making correct predictions. Since the last national inventory dates back to 2016, it’s important that the water points are monitored on a regular basis. With the letter from the above-mentioned Water Directorate, there will be a boost of more recent data, which will certainly have a positive effect.

We also encourage stakeholders to test whether the machine learning predictions correspond to reality. This can be done on a small scale. There are talks with the Ministry of Water Resources and InterAide to carry this out and test whether the outcomes of the tools are correct and usable in the daily life of decision makers. We would like to continue with this in 2021, in order to prove the power of advanced analytics, but above all to provide drinking water to as many Sierra Leoneans as possible.