Improving federal record-keeping for enhanced understanding of water systems in the US

Photo by Nathan Dumlao on Unsplash

By Jessie Norriss, Water Policy Fellow

Imagine you are having coffee and suddenly wonder where the water came from, which utility is responsible for delivering it to your home, and whether that utility has had any violations in the last six months. You could turn to Google, but what if there were an official source where you could get all of this information in one place? It turns out there is a federal database, and it lists more than 1,800 active water systems across Massachusetts alone. However, decoding who is responsible for managing those systems and how they operate is quite the feat. This blog post addresses some of the nuances of that process and offers recommendations for improving it going forward.

What is SDWIS and why does this data matter?

In 1994, the EPA launched a federal database called the Safe Drinking Water Information System (SDWIS), which provides a record of all federally regulated drinking water systems and their water quality violations. Currently, it has information on 145,610 active and 256,011 inactive water systems across the US. It has been an instrumental dataset for researchers, policy makers, and state and federal agencies seeking to understand water quality and distribution systems across the US. It has been used to assess contaminant violations over time, rates of regulatory compliance, and the environmental justice implications of contaminant violations.

Figure 1. This is an example of the SDWIS public-facing website showing all active water systems in Massachusetts owned by the state.

Despite its public value, the SDWIS dataset has many limitations that hinder easy analysis. A working paper led by Prof. Janice Beecher at Michigan State University calls attention to the many nuances in this dataset that can lead to ambiguity and faulty interpretation of SDWIS data. Here, we highlight the paper's findings on misclassification and its recommendations for improving the dataset to better serve water policy and management.

Misunderstandings of SDWIS Data

In the working paper, Prof. Beecher and her team identified several categories of data that can lead to misunderstandings. For example, systems managed under contractual agreements or partnerships cannot be identified from the dataset with any degree of accuracy. Nor is there anywhere in the data to record that a Community Water System has started supplying water from another source while still operating independently. Beyond that, the table below highlights some additional ambiguities:

Where can we go from here?

The paper notes that SDWIS should not be a primary source for analysis or for generating policy recommendations about water systems. Not only are there discrepancies in how the data is coded, but there are also significant inconsistencies in the data and variability in reporting between states. Instead, the dataset should be used to identify trends and to provide context on where further research is necessary to develop solid policy recommendations.

With some modifications, however, the EPA could address these misclassifications. Researchers, policy makers, and regulators would benefit greatly if the updated schema proposed by the authors were implemented, as it would allow them to better evaluate and manage drinking water systems. We would like to highlight a few of the authors' recommendations for improving this database for all users:

  • Adjust the naming conventions and classifications of who manages a system, which would enable greater understanding of where problems lie to inform more targeted recommendations about possible improvements (e.g. distinguish between governmental and non-governmental, rather than public/private, and document the utility name in addition to the system name).
  • Collect data on shared ownership or operators, partnerships, and shared service arrangements for water delivery to better capture the connectivity of water systems.
  • Better document the distinction between primary systems (water delivery is the system's primary function) and ancillary systems (water delivery is a service offered but not the primary function, as at a hospital or military complex); the two are often conflated, and ancillary systems overstate the population served.
  • Adjust how population served is reported, because 1) water systems serve both transient and non-transient populations, 2) ancillary systems overstate the population served, and 3) wholesalers record their service population as 0 or 1 when the reality is more nuanced and the details go unrecorded. These issues can lead to large variability in estimates of how many people a particular system serves.
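
To make these recommendations concrete, here is a minimal sketch in Python of what a single water system record might look like if the classifications above were adopted. It is not the schema proposed in the working paper, and it is not how SDWIS stores data today. Every class and field name below is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class OwnerType(Enum):
    # Governmental vs. non-governmental, rather than public/private.
    GOVERNMENTAL = "governmental"
    NON_GOVERNMENTAL = "non-governmental"


class SystemFunction(Enum):
    PRIMARY = "primary"      # water delivery is the system's main function
    ANCILLARY = "ancillary"  # e.g. a hospital or military complex


@dataclass
class WaterSystemRecord:
    """Hypothetical record layout; all field names are assumptions."""
    system_name: str
    utility_name: str  # documented in addition to the system name
    owner_type: OwnerType
    function: SystemFunction
    is_wholesaler: bool = False
    # Populations are kept separate instead of folded into one number.
    non_transient_population: Optional[int] = None
    transient_population: Optional[int] = None
    # Partner systems, shared operators, or shared service arrangements.
    shared_arrangements: List[str] = field(default_factory=list)

    def has_placeholder_population(self) -> bool:
        """Flag wholesalers whose population looks like a 0-or-1 placeholder."""
        return self.is_wholesaler and self.non_transient_population in (0, 1)


# Made-up example values, not taken from SDWIS.
example = WaterSystemRecord(
    system_name="Example Regional Supply",
    utility_name="Example Water Authority",
    owner_type=OwnerType.GOVERNMENTAL,
    function=SystemFunction.PRIMARY,
    is_wholesaler=True,
    non_transient_population=1,
    shared_arrangements=["Example Town Water Department"],
)
print(example.has_placeholder_population())
```

Keeping the utility name, owner type, and function as separate fields is one way to let an analyst filter out ancillary systems or wholesaler placeholders before adding up the population served.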

In addition, for each water system, we would like SDWIS to document the nearest water system that would be most feasible for an interconnection should a need arise. Just the exercise of identifying viable neighboring systems will foster a spirit of collaboration, and this information would be greatly beneficial to state administrators and community leaders when discussing system consolidation options. Whether systems are responding to natural disasters, looking to share expertise or operators, or seeking long-term consolidation options, knowing the nearest utility they could connect with could expedite that process at critical times in system management.
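
As a rough illustration of how a first-pass list of neighboring systems could be generated, the Python sketch below finds the closest other system by straight-line (great-circle) distance. This is only a starting point: distance alone does not establish that an interconnection is feasible, which would also depend on terrain, capacity, water chemistry, and governance. The function names, system names, and coordinates are all made up for this example, and SDWIS is not assumed to supply coordinates in this form.

```python
import math
from typing import Dict, Optional, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude) in decimal degrees

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: Coordinate, b: Coordinate) -> float:
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_system(name: str, locations: Dict[str, Coordinate]) -> Optional[str]:
    """Return the name of the closest other system, or None if there is none."""
    origin = locations[name]
    distances = {
        other: haversine_km(origin, point)
        for other, point in locations.items()
        if other != name
    }
    if not distances:
        return None
    return min(distances, key=distances.get)


# Hypothetical systems with made-up coordinates.
systems = {
    "System A": (42.0, -71.0),
    "System B": (42.1, -71.0),
    "System C": (43.0, -71.0),
}
print(nearest_system("System A", systems))
```

A state administrator could treat the result as a candidate to review by hand rather than as a final answer.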

It is worth noting that the EPA has been working to improve SDWIS since 2009, but these efforts have largely focused on the virtual infrastructure for storing the data more efficiently and managing big datasets, and less on how best to classify the information provided by the states in order to understand water systems and utility management. SDWIS will remain a central tool for understanding water quality and water systems throughout the United States, and with relatively small adjustments it could become even more useful to researchers, regulators, and policy makers.