
Raise clear error when load step finds empty transformed extract#223

Merged
dmarulli merged 1 commit into main from dmarulli/raise-on-empty-transformed-load on May 12, 2026

Conversation

@dmarulli
Collaborator

Problem

When load_transformed encounters an empty reads.json.gz (or meters.json.gz) in S3, it crashes inside the _default_decoder.decode() chain with JSONDecodeError: Expecting value: line 1 column 1 (char 0). The failure notifier fires, but the alert is cryptic — it gives no signal about the underlying state.

This has been happening every 2 hours on the cadc_crescent-ami-meter-read-dag-backfill-2023-01-01-2026-04-23 DAG since the backfill marched past the vendor's earliest reading (2023-01-01 16:00 UTC). The xylem_datalake adapter queries the account table with no date filter, so the meters file is always non-empty (~7,750 accounts); the water_intervals / water_registers queries are date-filtered, so for pre-floor chunks they return zero rows and the reads file is empty.
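A minimal standalone reproduction of the failure mode (the real code path goes through the S3 output controller, but only gzip + json behavior is needed to show it):

```python
import gzip
import io
import json

# An empty reads.json.gz: a valid gzip stream whose decompressed body is empty.
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb"):
    pass  # write nothing

text = gzip.decompress(buf.getvalue()).decode("utf-8")
assert text == ""

try:
    json.loads(text)  # what the load step effectively did with the empty body
except json.JSONDecodeError as e:
    print(e)  # Expecting value: line 1 column 1 (char 0)
```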

Why the existing check doesn't catch this

_calculate_backfill_range at amiadapters/adapters/base.py:652-655 raises "consider removing this backfill" when end <= min_date (where end = MIN(flowtime) from existing readings). For Crescent, MIN(flowtime) = 2023-01-01 16:00 UTC and the configured min_date = 2023-01-01 00:00, so end > min_date forever: the vendor's data floor sits 16 hours above the configured floor. The check is structurally incapable of firing for any org whose vendor floor is strictly above its configured min_date. The PR #222 review flagged this completion edge case at the time.
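Sketched with the concrete values above (variable names are illustrative, not the actual base.py code):

```python
from datetime import datetime, timezone

min_date = datetime(2023, 1, 1, 0, 0, tzinfo=timezone.utc)       # configured floor
vendor_floor = datetime(2023, 1, 1, 16, 0, tzinfo=timezone.utc)  # vendor's earliest reading

# `end` is MIN(flowtime) of existing readings; it can never drop below the
# vendor's earliest reading, so the `end <= min_date` completion branch is
# unreachable whenever the vendor floor sits strictly above the configured floor.
end = vendor_floor
assert not (end <= min_date)  # the "consider removing this backfill" check never fires
```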

A stateless pre-extract check cannot detect "vendor has no more data" — that signal only exists after asking the vendor. The natural place for the check is post-extract.

Changes

  1. amiadapters/outputs/s3.py and amiadapters/outputs/local.py: read_transformed_meters and read_transformed_meter_reads return [] instead of crashing when the file is empty (just defensive — empty input should never produce JSONDecodeError).
  2. amiadapters/adapters/base.py: in load_transformed, if either meters or reads is empty, raise a clear exception with the decommission CLI suggestion. Fires the existing failure notifier with an alert message that tells the operator what to do.

The exception message handles both completion and outage cases:

  • Backfill at vendor floor → "decommission with python cli.py config remove-backfill ..."
  • Standard / lagged / manual DAG anomaly (vendor outage, auth failure) → "this likely indicates a vendor outage or authentication failure"
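The two changes and the branching message can be sketched roughly as follows (function names come from the PR description, but the signatures, the `is_backfill` flag, and the exception wording here are assumptions, not the merged code):

```python
import gzip
import json


def read_transformed_meter_reads(raw_bytes: bytes) -> list:
    """Defensive read: an empty file yields [] instead of a JSONDecodeError."""
    text = gzip.decompress(raw_bytes).decode("utf-8")
    return json.loads(text) if text.strip() else []


def load_transformed(meters: list, reads: list, is_backfill: bool) -> None:
    """Fail fast with an actionable message when either list is empty."""
    if not meters or not reads:
        if is_backfill:
            hint = (
                "the backfill has likely reached the vendor's data floor; "
                "decommission it with `python cli.py config remove-backfill ...`"
            )
        else:
            hint = "this likely indicates a vendor outage or authentication failure"
        raise RuntimeError(
            f"Empty transformed extract (meters={len(meters)}, "
            f"reads={len(reads)}): {hint}"
        )
    # ... proceed with the normal load ...
```

The raise happens before any sink writes, so a completion-state backfill fails cleanly with actionable text instead of partially loading.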

Per-adapter safety check

I spot-checked each adapter to make sure no live path produces empty meters+reads under healthy steady-state operation:

| Adapter | Empty on healthy run? |
| --- | --- |
| xylem_datalake | Only for pre-floor chunks (Crescent's exact case) — completion signal |
| aclara | No — load_from_file on meters_and_reads.json already errors in transform without allow_empty=True |
| beacon | No — report query errors before load |
| sentryx | No — 0 meters or 0 reads is a real anomaly |
| subeca | No — every account has a latestReading |
| metersense | Uncertain — uses allow_empty=True for sub-files. Realistically, a 2-day Oracle window with 0 reads on an active utility is highly unlikely |
| xylem_moulton_niguel | Same as metersense |
| xylem_sensus | Not in live production yet (South Tahoe awaiting SFTP creds) |

If the metersense / xylem_moulton_niguel adapters end up firing false-positive alerts in practice, the check can be made adapter-aware in a follow-up. The strict check is the safe default — water-meter telemetry is continuous and zero rows in any reasonable window is worth alerting on.

Test plan

  • test/amiadapters/outputs/test_s3.py — 2 new tests for empty-file return-[] behavior
  • test/amiadapters/outputs/test_local.py — 2 new tests for the same in LocalTaskOutputController
  • test/amiadapters/test_base.py — 4 new tests covering load_transformed happy path + empty meters / empty reads / both empty
  • Full unit test suite passes (249 tests)
  • Black formatting check passes
  • After merge + deploy: verify Crescent's next backfill run fails with the new exception text rather than JSONDecodeError, then decommission the backfill DAG
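One of the new empty-file tests might look like this (a hedged sketch — the controller and method names are assumed from the description above; the real tests live in test/amiadapters/outputs/test_s3.py against the actual S3 controller):

```python
import gzip
import json


class FakeOutputController:
    """Stand-in for the S3/local controller with the new empty-file behavior."""

    def __init__(self, raw_bytes: bytes):
        self.raw_bytes = raw_bytes

    def read_transformed_meter_reads(self) -> list:
        text = gzip.decompress(self.raw_bytes).decode("utf-8")
        return json.loads(text) if text.strip() else []


def test_empty_reads_file_returns_empty_list():
    controller = FakeOutputController(gzip.compress(b""))
    assert controller.read_transformed_meter_reads() == []


def test_nonempty_reads_file_parses_json():
    payload = gzip.compress(json.dumps([{"meter_id": "m1"}]).encode("utf-8"))
    controller = FakeOutputController(payload)
    assert controller.read_transformed_meter_reads() == [{"meter_id": "m1"}]
```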

Tolerate empty meters.json.gz / reads.json.gz files in the S3 and local
output controllers (return []), then explicitly raise at the start of
load_transformed if either list is empty. The exception message tells the
operator that the most likely cause is a backfill DAG that has reached the
vendor's data floor, and gives the exact decommission command.

Previously, an empty reads file caused a cryptic JSONDecodeError deep in
json.decoder, which fired the failure notifier without surfacing the actual
cause. The existing pre-extract check at base.py:652 structurally cannot
detect this case when the vendor's earliest reading sits above the
configured backfill min_date, so MIN(flowtime) from existing readings
never reaches min_date and no exception fires.

This is the case Crescent (xylem_datalake) has been hitting since the
backfill DAG marched past 2023-01-01: account list always non-empty,
reads file empty for pre-floor chunks, JSONDecodeError on every run.
@dmarulli dmarulli merged commit 38252f6 into main May 12, 2026
2 checks passed
@dmarulli dmarulli deleted the dmarulli/raise-on-empty-transformed-load branch May 12, 2026 14:22
