Access Data

WPdx Global Data Repositories

WPdx maintains two datasets: WPdx-Basic and WPdx+. Both are free and open to explore and download. See the tabs below for more details on each dataset. For both datasets, users can:

  • Sort and filter data
  • Create custom sub-datasets based on location or other parameters of interest
  • Visualize data using charts, graphs and simple maps
  • Download data

Explore the full WPdx repository in the WPdx-Basic online data playground.

What is included in WPdx-Basic?

  • All data shared with WPdx is included in the WPdx-Basic Global Data Repository. Six validation and data-cleaning steps occur during the ingestion process:
    • Ensure that all records contain the required parameters. Records that do not contain required parameters are not uploaded. A summary of records included in the upload can be found in the WPdx Data Catalog.
    • Check that points are located within a country boundary per GADM boundaries. Points that do not fall within a country boundary (e.g., in the middle of the ocean) are not uploaded. A summary of records included in the upload can be found in the WPdx Data Catalog.
    • Formatting of entries for consistency. For example, values for the Presence of Water When Assessed (#status_id) parameter are shown in the repository as “Yes” or “No”.
    • Addition of ‘clean’ versions of country name, #adm1, #adm2 and #adm3 based on the provided GPS coordinates and GADM boundaries. ‘Clean’ values are appended to the record, leaving all original data intact.*
    • Addition of “water_source_clean”, “water_tech_clean” and “management_clean” columns. These new columns are created using fuzzy matching to organize entries into consistent categories. For more information on the cleaning process, please see here.*
    • Categorization of facility_type into improved and unimproved based on JMP definitions.*

*Data cleaning and categorizing processes are inherently imperfect, but will be routinely reviewed, updated, and refined to ensure we are representing the shared data as accurately as possible. Please note that the cleaning process may result in some errors and discrepancies in categorizations which may impact the results of the WPdx Decision Support analyses. Suggestions for improvements and other feedback are most welcome!
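
The ingestion steps above can be illustrated with a minimal Python sketch. It is not the WPdx implementation: the required-field list, the category list and the similarity cutoff are assumptions chosen for illustration, and the standard library's `difflib` stands in for whatever fuzzy matching WPdx actually uses. Only `status_id`, `water_source` and `water_source_clean` are names taken from the text above.

```python
from difflib import get_close_matches

# Hypothetical required fields and category list (assumptions, not the WPdx schema).
REQUIRED = ("lat_deg", "lon_deg", "report_date", "status_id")
WATER_SOURCE_CATEGORIES = ["Borehole", "Protected Spring", "Rainwater Harvesting"]


def has_required(record):
    """True when every required field is present and non-empty."""
    for field in REQUIRED:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            return False
    return True


def normalize_status(value):
    """Map common yes/no spellings to "Yes" or "No"; leave anything else unchanged."""
    text = str(value).strip().lower()
    if text in {"yes", "y", "true", "1"}:
        return "Yes"
    if text in {"no", "n", "false", "0"}:
        return "No"
    return value


def clean_category(raw, categories, cutoff=0.6):
    """Return the closest known category, or None when nothing is similar enough."""
    lookup = {c.lower(): c for c in categories}
    matches = get_close_matches(str(raw).strip().lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


def ingest(records):
    """Drop incomplete records, then append cleaned values without altering originals."""
    cleaned = []
    for record in records:
        if not has_required(record):
            continue  # records missing required parameters are not uploaded
        row = dict(record)  # work on a copy so the caller's record is untouched
        row["status_id"] = normalize_status(row["status_id"])
        # The original water_source entry stays; the clean value is a new column.
        row["water_source_clean"] = clean_category(
            row.get("water_source", ""), WATER_SOURCE_CATEGORIES
        )
        cleaned.append(row)
    return cleaned
```

A lower cutoff matches more misspellings but also raises the chance of assigning an entry to the wrong category, which is one source of the discrepancies the note above mentions.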

Access and download country data from WPdx-Basic

  • Select a country from the dropdown list to download (in CSV format) all available country data from the WPdx-Basic Global Data Repository.
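
Once a country CSV has been downloaded, a custom sub-dataset can be built locally. The sketch below uses only Python's standard `csv` module; the file name and the column names in the usage comment are placeholders, since the exact headers depend on the file you download.

```python
import csv


def load_rows(csv_file):
    """Read an open CSV file into a list of dicts keyed by column header."""
    return list(csv.DictReader(csv_file))


def subset(rows, column, value):
    """Keep rows whose `column` equals `value`, ignoring case and outer spaces."""
    wanted = value.strip().lower()
    return [r for r in rows if (r.get(column) or "").strip().lower() == wanted]


# Usage (the file name and column name are placeholders):
# with open("wpdx_basic_country.csv", newline="", encoding="utf-8") as f:
#     rows = load_rows(f)
# with_water = subset(rows, "status_id", "Yes")
```

Reading everything into memory keeps the example short; for very large downloads, iterating over `csv.DictReader` row by row would use less memory.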

Explore an enhanced subset of data in the WPdx+ online data playground.

The WPdx+ dataset is an enhanced version of the WPdx-Basic dataset for a subset of countries for which WPdx has enough data for the decision support tools to be activated. These enhancements can be completed for any country if a representative dataset can be shared with WPdx.

Enhancements include:

  • Further data processing steps including:
    • Removal of records identified as having location mismatches (i.e., the data provided state that the record is from Country X, but the GPS location is in Country Y).
    • De-duplication for any records mistakenly uploaded twice (exact matches only).
    • Removal of records classified as ‘Unimproved’ and ‘No Facilities’ based on entries shared for #water_source and #water_tech.
    • Assignment of a WPdx_id which matches water point records shared by different organizations and on different dates, based on GPS location.
  • Addition of relevant external data sources used in the water point status prediction models, including:
    • Distance between the water point and the nearest road (primary, secondary and tertiary), town and city, using OpenStreetMap data.
    • Additional external data coming soon!
  • Tabular access to results from WPdx advanced decision support tools:
    • Rehabilitation Priority
      • Which non-functional water point should be prioritized for repair?
      • Population living within 1km of water point
      • Likely current and potential users
      • Crucialness of water point (are there alternate working points nearby?)
      • Pressure on water point (is the point over- or underutilized?)
    • Data Staleness
      • Days since report date
      • “Staleness” score calculated using a geometric decay model. More information on methods available here.
    • Water Point Status Predictions
      • Which water points are at a higher risk of failure? (coming soon)
  • Administrative Region Analysis
  • New Construction Priority
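
To make the Data Staleness fields listed above concrete, here is a minimal Python sketch of a geometric decay score. The retention rate, the 0 to 100 scale and the direction of the score (100 for a report made today, falling toward 0 as it ages) are assumptions for illustration only; the actual WPdx method is described in the linked methods documentation and may differ.

```python
from datetime import date


def days_since_report(report_date, today):
    """Whole days between the report date and today (never negative)."""
    return max((today - report_date).days, 0)


def staleness_score(days, annual_retention=0.75):
    """Geometric decay: start at 100 and multiply by `annual_retention` per year.

    `annual_retention` is a hypothetical rate, not a WPdx parameter.
    """
    return 100.0 * annual_retention ** (days / 365.0)


# Usage: a record reported two years (730 days) before the reference date.
# age = days_since_report(date(2020, 1, 1), date(2021, 12, 31))
# score = staleness_score(age)
```

With geometric decay the score drops by the same proportion each year, so old records lose relevance quickly at first and then level off rather than reaching exactly zero.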
