How-To Guide: Part Six
Data & Accessibility
Learn about ways to make water service area boundaries easily accessible once developed.
State Data Portals
Many state agencies make drinking water and other data available to the public through portals such as California Open Data and Maryland's Open Data Portal. Publishing water service area boundary data through these portals makes the data accessible and encourages its use. States should also ensure that the data is machine-readable, is interoperable (it can easily work with other systems, often via an API), and follows metadata standards. The EPA team will regularly review which states have made this data publicly available and add it to the national EPA database, replacing its modeled boundary data with the more accurate and authoritative state data.
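As one illustration of what "machine-readable" with attached metadata can look like, the sketch below packages a single boundary as a GeoJSON Feature using only the Python standard library. The property names (pwsid, name, source, last_updated), the placeholder PWSID, and the coordinates are assumptions made for the example; they are not a schema required by any state portal or by EPA.

```python
import json


def boundary_feature(pwsid, name, ring, source, last_updated):
    """Build a GeoJSON Feature for one service area boundary.

    `ring` is a list of [longitude, latitude] positions for the outer
    boundary. The property names are hypothetical, chosen for this example.
    """
    # GeoJSON polygon rings must be closed: first and last positions match.
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {
            "pwsid": pwsid,
            "name": name,
            "source": source,
            "last_updated": last_updated,
        },
    }


# Placeholder values for illustration only, not a real water system.
feature = boundary_feature(
    pwsid="XX0000001",
    name="Example Water System",
    ring=[[-120.0, 38.0], [-119.9, 38.0], [-119.9, 38.1], [-120.0, 38.1]],
    source="utility-submitted",
    last_updated="2023-01-01",
)

collection = {"type": "FeatureCollection", "features": [feature]}
text = json.dumps(collection, indent=2)
```

A plain-text format like this can be read by most GIS tools and served through an API without conversion, which is the practical meaning of interoperability here.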
National Repository
The Internet of Water (IoW) created a repository to manage the best available boundaries for public water systems in the United States, with moderated community contributions. These are the systems identified, but not spatially defined, in the USEPA Safe Drinking Water Information System. For now, the base layer is the latest national dataset created by SimpleLab Inc., which is hosted on HydroShare. Community contributions to IoW's GitHub repository are updated with additional attributes and synced to the layer in HydroShare.
This repository includes state-based directories where community members may submit files associated with each individual contributed boundary. NOTE: This is a moderated repository. Any authorized user may open a pull request from a fork and submit a file to the appropriate state directory with the required metadata (PWSID, name, the type of data that the boundary was generated from, source date, contribution date, etc.).
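Before opening a pull request, a contributor might check that the metadata is complete. The sketch below is a minimal example of such a check, not the repository's actual validation: the field names (pwsid, name, data_type, source_date, contribution_date) and the ISO date format are assumptions, and the repository's own documentation defines the real requirements, including any fields covered by "etc." above.

```python
from datetime import date

# Hypothetical field names mirroring the metadata listed above.
REQUIRED_FIELDS = ("pwsid", "name", "data_type", "source_date", "contribution_date")
DATE_FIELDS = ("source_date", "contribution_date")


def missing_metadata(properties):
    """Return the required fields that are absent or blank."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(properties.get(field) or "").strip()
    ]


def invalid_dates(properties):
    """Return the date fields that are present but not valid YYYY-MM-DD dates."""
    bad = []
    for field in DATE_FIELDS:
        value = properties.get(field)
        if value is None:
            continue  # absence is reported by missing_metadata instead
        try:
            date.fromisoformat(value)
        except (TypeError, ValueError):
            bad.append(field)
    return bad


# Placeholder submission with one blank field and one impossible date.
submission = {
    "pwsid": "XX0000001",
    "name": "Example Water System",
    "data_type": "",
    "source_date": "2022-13-01",
    "contribution_date": "2023-01-15",
}

print(missing_metadata(submission))  # ['data_type']
print(invalid_dates(submission))     # ['source_date']
```

Keeping the two checks separate lets a contributor see whether a field is missing altogether or merely malformed.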
Considerations for Accuracy
Because this effort takes time, states recognized the value of publishing data as it becomes available and improving its accuracy over time (as opposed to waiting to publish until the data was 95% accurate), as long as the data includes metadata and disclaimers about data quality. One state noted that there will always be a last 10-15% of systems that do not respond, but even an incomplete dataset is tremendously helpful. California suggests that having at least one pair of eyes reviewing the output is necessary to be confident in a “good enough” dataset.