The Water Point Data Exchange (WPdx) is pleased to announce the launch of a new analysis-ready dataset, Water Point Data Exchange Plus (WPdx+). WPdx+ is a further enhanced and refined version of the original WPdx dataset, now known as WPdx-Basic.

Through the online data playgrounds, all users can:

  • Sort and filter data
  • Create custom sub-datasets based on location or other parameters of interest
  • Visualize data using charts, graphs and simple maps
  • Download/export data

The WPdx+ dataset is focused on a subset of countries for which WPdx has enough data for the decision support tools to be activated. The tools and dataset enhancements can be made available for any country if a representative dataset can be shared with WPdx.

The WPdx+ dataset is the input for the new suite of decision-support tools, which are under development. Please check out the recently updated Rehabilitation Priority tool. Updates to the remaining tools will be released in the coming months.

For more information on how to add a new country to WPdx+, email info@waterpointdata.org with “New Country Interest” in the subject line.

Please see below for a brief summary of the two datasets:

  • All data shared with WPdx is included in the WPdx-Basic Global Data Repository.
  • There are five validation/data-cleaning steps which occur during the ingestion process:
    • Ensure that all records contain the required parameters. Records
      that do not contain required parameters are not uploaded. A summary of records included in the upload can be found in the WPdx Data Catalog.
    • Check that points are located within a country boundary
      per GADM boundaries. Points which do not fall within country boundaries (e.g., in the middle of the ocean) are not uploaded. A summary of records included in the upload can be found in the WPdx Data Catalog.
    • Formatting of entries for consistency. For example, for the Presence
      of Water When Assessed (#status_id) parameter, entries in the repository will be shown as “Yes” or “No”.
    • Addition of ‘clean’ versions of country name, #adm1, #adm2 and #adm3
      based on provided GPS coordinates and GADM boundaries. ‘Clean’ values are appended to the record, leaving all original data intact.
    • Addition of “water_source_clean”, “water_tech_clean” and “management_clean” columns. These new columns are created using fuzzy matching to organize entries into consistent categories. For more information on the cleaning process, please see here.
  • WPdx+ is a focused dataset covering countries for which WPdx has enough data for decision-support tools to be activated.
  • Further data processing steps including:
    • Removal of records identified as having location mismatches (i.e., data provided states that record is from Country X, but GPS location is in Country Y)
    • De-duplication for any records mistakenly uploaded twice (exact matches only).
    • Assignment of a WPdx_id which matches water point records shared by different organizations and on different dates, based on GPS location.
  • Addition of relevant external data sources which are used in the water point status prediction models, including:
    • Distance between water point and nearest road (primary, secondary and tertiary), town and city using OpenStreetMap data.
    • Additional external data coming soon!
  • Tabular access to results from advanced decision-support tools:
    • Rehabilitation Priority
      • Which non-functional water point should be prioritized for repair?
      • Population living within 1km of water point
      • Likely current and potential users
      • Crucialness of water point (are there alternate working points nearby?)
      • Pressure on water point (is the point over- or under-utilized?)
    • Water Point Status Predictions – which water points are at a higher risk of failure? (coming soon)
    • Construction Priority – which locations should be considered for new construction to reach unserved populations? (coming soon)
    • Measure water access by administrative division – how does coverage vary within different sub-national divisions? (coming soon)
  • Inclusion of additional key external parameters (coming soon)
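
To make some of the cleaning steps listed above more concrete, here is a minimal Python sketch of four of them: the required-parameter check, consistent “Yes”/“No” formatting, fuzzy matching into consistent categories, and exact-match de-duplication. It is an illustration under stated assumptions, not the actual WPdx pipeline. The field names "lat", "lon" and "water_tech", the "status_clean" column, the category list and the accepted yes/no spellings are all hypothetical; only "#status_id" and "water_tech_clean" are named in the summary above. The fuzzy matching uses Python's standard difflib module, which may differ from the method WPdx uses.

```python
from difflib import get_close_matches

# Illustrative assumptions: none of these values come from the WPdx schema,
# apart from "#status_id", which the summary above mentions.
REQUIRED_FIELDS = ("lat", "lon", "#status_id")
TECH_CATEGORIES = ("Hand Pump", "Mechanized Pump", "Tapstand")
YES_VALUES = {"yes", "y", "true", "1"}
NO_VALUES = {"no", "n", "false", "0"}


def has_required_fields(record):
    """Return True if every required field is present and non-empty."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            return False
    return True


def format_status(raw):
    """Normalize a free-text status entry to "Yes" or "No" (None if unknown)."""
    key = str(raw).strip().lower()
    if key in YES_VALUES:
        return "Yes"
    if key in NO_VALUES:
        return "No"
    return None


def clean_category(raw, categories=TECH_CATEGORIES, cutoff=0.6):
    """Fuzzy-match a free-text entry to the closest known category, if any."""
    lookup = {c.lower(): c for c in categories}
    matches = get_close_matches(
        str(raw).strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


def ingest(records):
    """Drop incomplete records and exact duplicates, then append clean columns."""
    seen = set()
    cleaned = []
    for record in records:
        if not has_required_fields(record):
            continue  # records missing required parameters are not uploaded
        key = tuple(sorted((k, str(v)) for k, v in record.items()))
        if key in seen:
            continue  # exact duplicate of an earlier record
        seen.add(key)
        out = dict(record)  # original values are left intact
        out["status_clean"] = format_status(record["#status_id"])
        out["water_tech_clean"] = clean_category(record.get("water_tech", ""))
        cleaned.append(out)
    return cleaned
```

Calling ingest() on a list of dictionaries returns new dictionaries with the cleaned columns appended, mirroring the point above that ‘clean’ values are added alongside the original data rather than overwriting it.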