
Proposal to Add Taxonomy for Water Tech and Water Source fields

Submitter:

  • Global Water Challenge

Proposal Supporters that have Shared Data:

  • Haiti Outreach – Brian Jensen
  • Global Environment & Technology Foundation – Brian Banks

Reason for Proposed Change:

  • Both fields are currently open text, so a taxonomy would improve data quality. Initially, the taxonomy would be recommended rather than required. After a few years of use and uptake, it could be made a requirement to further strengthen the standard.

Proposed Change and Implementation Guidance:

  • Proposed Taxonomy Development
    • The recommended taxonomy has two levels: a broad category listed first (e.g. handpump or groundwater), followed by the specific technology name or water source (e.g. India Mark II or deep borehole).

#water_tech examples:

  • Powered Pumps – Solar Pump
  • Handpump – India Mark II

#water_source examples:

  • Surface water – Stream
  • Groundwater – Borehole
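
The two-level format shown in the examples above can be checked mechanically. The following is a minimal sketch, assuming that values are written as a broad category and a specific name separated by a spaced en dash, as in the examples. The names `TaxonomyValue` and `parse_taxonomy_value` are hypothetical and not part of the WPDx standard.

```python
from typing import NamedTuple, Optional

# Assumption: the separator is an en dash (U+2013) surrounded by spaces,
# as in the examples in this proposal.
SEPARATOR = " \u2013 "


class TaxonomyValue(NamedTuple):
    category: str  # broad category, e.g. "Handpump"
    specific: str  # specific technology or source, e.g. "India Mark II"


def parse_taxonomy_value(raw: str) -> Optional[TaxonomyValue]:
    """Split a value into (category, specific); return None if it does not fit the format."""
    category, sep, specific = raw.strip().partition(SEPARATOR)
    category, specific = category.strip(), specific.strip()
    if not sep or not category or not specific:
        return None  # free text that does not follow the two-level format
    return TaxonomyValue(category, specific)


if __name__ == "__main__":
    print(parse_taxonomy_value("Handpump \u2013 India Mark II"))
    print(parse_taxonomy_value("river"))
```

A value that returns `None` here is simply free text that does not follow the recommended format; it is not necessarily wrong data.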

Evidence that the Data is Being Collected:

  • At least one of these fields is required for submission into WPDx, so this data is always collected. In a recent analysis conducted by partners at Berkeley University using a fuzzy matching script, 82% of #water_tech values and 56% of #water_source values in WPDx could be mapped to the taxonomy.
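
The fuzzy matching script used in that analysis is not described here. As a rough illustration of the general idea only, the sketch below maps free-text values to taxonomy entries with Python's standard `difflib` module. The taxonomy list is hypothetical and contains only the examples from this proposal, and the function names and the `cutoff` value are assumptions, not the partners' method.

```python
import difflib
from typing import Iterable, List, Optional

# Hypothetical taxonomy, built only from the examples in this proposal.
WATER_TECH_TAXONOMY = [
    "Powered Pumps \u2013 Solar Pump",
    "Handpump \u2013 India Mark II",
]


def map_to_taxonomy(raw: str, taxonomy: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the closest taxonomy entry for a free-text value, or None if nothing is close."""
    lookup = {}
    for entry in taxonomy:
        # Match against both the full entry and its specific name alone.
        specific = entry.split(" \u2013 ", 1)[-1]
        lookup[entry.lower()] = entry
        lookup[specific.lower()] = entry
    matches = difflib.get_close_matches(raw.strip().lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


def mapped_share(values: Iterable[str], taxonomy: List[str]) -> float:
    """Fraction of values that can be mapped to some taxonomy entry."""
    values = list(values)
    if not values:
        return 0.0
    mapped = sum(1 for v in values if map_to_taxonomy(v, taxonomy) is not None)
    return mapped / len(values)


if __name__ == "__main__":
    print(map_to_taxonomy("india mark 2", WATER_TECH_TAXONOMY))
    print(mapped_share(["india mark 2", "river"], WATER_TECH_TAXONOMY))
```

A stricter `cutoff` reduces false matches at the cost of leaving more values unmapped, so any reported mapping rate depends on that choice.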

Impact on Data Usage:

  • Using a taxonomy for the #water_tech and #water_source fields will improve data quality by reducing the number of distinct values entered for these two attributes. It will also reduce typos, because values in these fields are recommended to come only from a well-defined set. Additionally, the taxonomy clarifies the distinction that WPDx makes between water technology and water source. Currently, some WPDx datasets list a water source (e.g. river) in the water tech field, and a taxonomy would reduce how often this happens. With a taxonomy for these fields, data can be analyzed more efficiently. For example, an analysis comparing technology types would be able to draw on higher-quality, more accurate data.