WaterQ

Data Sources

WaterQ is built on publicly available federal data. Here is a detailed overview of our data sources and how we use them.

EPA Safe Drinking Water Information System (SDWIS)

Primary Data Source

SDWIS is the EPA's comprehensive database of public water system information and is our primary data source. It contains records for over 150,000 public water systems serving approximately 300 million Americans.

Data We Use

  • Water System Inventory: System names, IDs (PWSID), locations, service populations, water source types, and system classifications
  • Contaminant Test Results: Measured concentrations, test dates, Maximum Contaminant Levels (MCLs), and violation flags
  • Violation Records: Violation types, severity classifications, dates, resolution status, and enforcement actions
Format: REST API / CSV
Update: Quarterly
Coverage: All US states & territories

USGS Water Quality Portal

Supplementary Source

The Water Quality Portal is a cooperative service by the United States Geological Survey (USGS), the EPA, and the National Water Quality Monitoring Council. It provides access to water quality monitoring data from multiple federal, state, and tribal agencies.

Data We Use

  • Environmental Monitoring: Surface water and groundwater quality measurements from monitoring stations
  • Regional Context: Helps identify potential contamination sources and regional water quality patterns
Format: REST API / WQX
Update: Varies by source
Coverage: Nationwide monitoring stations

EPA Envirofacts

Supplementary Source

Envirofacts is the EPA's multi-system search tool that provides access to environmental information from across the agency's databases, including facility compliance and enforcement data.

Data We Use

  • Facility Information: Additional details about water treatment facilities and their compliance history
  • Enforcement Actions: Formal enforcement actions taken against non-compliant systems
Format: REST API
Update: Varies
Coverage: All EPA-regulated facilities

Update Schedule

Data Type | Frequency | Notes
Water System Inventory | Quarterly | Jan, Apr, Jul, Oct
Contaminant Test Results | Quarterly | Aligned with EPA reporting cycles
Violations | Quarterly | Including resolution status updates
Scores & Grades | After each data update | Recalculated when new data is available

Data Processing

Raw data from our sources goes through several processing steps before being presented on WaterQ:

  1. Ingestion: Raw data is fetched from federal APIs and loaded into our processing pipeline

  2. Validation: Records are checked for completeness, format consistency, and data quality

  3. Normalization: Data from multiple sources is mapped to a unified schema with consistent units and identifiers

  4. Scoring: Water quality scores and grades are calculated using our scoring methodology

  5. Aggregation: System-level data is aggregated to city, county, and state levels using population-weighted averages
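
The validation, normalization, and aggregation steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not WaterQ's actual pipeline: the record field names (pwsid, contaminant, value, unit), the mg/L target unit, the conversion table, the SystemScore type, and the 0-100 score scale are all hypothetical and chosen only to make the example concrete.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical conversion factors to mg/L; the real unified schema may differ.
TO_MG_PER_L = {"mg/L": 1.0, "ug/L": 0.001, "ng/L": 0.000001}


@dataclass
class SystemScore:
    pwsid: str       # public water system ID
    county: str      # hypothetical grouping key
    population: int  # service population
    score: float     # system-level score (0-100 scale assumed)


def is_valid(record: dict) -> bool:
    """Validation: required fields present, known unit, non-negative number."""
    required = ("pwsid", "contaminant", "value", "unit")
    if any(record.get(key) in (None, "") for key in required):
        return False
    value = record["value"]
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return record["unit"] in TO_MG_PER_L and value >= 0


def normalize(record: dict) -> dict:
    """Normalization: express every concentration in mg/L."""
    factor = TO_MG_PER_L[record["unit"]]
    return {**record, "value": record["value"] * factor, "unit": "mg/L"}


def aggregate_by_county(systems: list[SystemScore]) -> dict[str, float]:
    """Aggregation: population-weighted average score per county."""
    weighted_sum: dict[str, float] = defaultdict(float)
    total_pop: dict[str, int] = defaultdict(int)
    for s in systems:
        weighted_sum[s.county] += s.score * s.population
        total_pop[s.county] += s.population
    # Skip counties whose systems report zero total population.
    return {c: weighted_sum[c] / p for c, p in total_pop.items() if p > 0}
```

Weighting by service population means a large system influences its county's figure more than a small one. For example, a system serving 9,000 people with a score of 80 and a system serving 1,000 people with a score of 60 combine to (80 × 9,000 + 60 × 1,000) / 10,000 = 78, rather than the unweighted mean of 70.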

Data Accuracy

While we make every effort to ensure accuracy, WaterQ relies on data reported by water systems to federal agencies. Reporting delays, data entry errors, and testing gaps may affect the accuracy and completeness of the information presented. If you believe any data is incorrect, please contact us so we can investigate.