fix(watchdog): add s3:ListBucket on _alerts/_dedup prefix (L295)#331

Merged
cipher813 merged 1 commit into
mainfrom
feat-l295-watchdog-listbucket-260527
May 27, 2026

Conversation

@cipher813
Owner

Summary

`alpha_engine_lib.alerts.publish`'s S3-backed dedup uses HeadObject + conditional PUT semantics, and the HeadObject probe needs `s3:ListBucket` on the bucket itself (scoped via an `s3:prefix` condition): without it, S3 answers a probe for a marker that does not exist yet with 403 AccessDenied instead of 404. The watchdog role previously had only `s3:GetObject` + `s3:PutObject` on the dedup-marker prefix, so dedup probes errored with AccessDenied. The lib's fail-safe-to-publish path correctly fired the alert anyway, but with dedup non-functional, every cron firing during a persistent outage re-paged the operator instead of collapsing under the 12h dedup window.
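The failure mode can be sketched as a small simulation. This is not the lib's code: `ProbeError`, `decide`, and `simulate` are hypothetical names, the 12h expiry is not modelled, and the sketch assumes the fail-safe path publishes without recording a marker, which matches the observed re-paging but is not confirmed by this PR.

```python
from typing import Callable


class ProbeError(Exception):
    """Hypothetical stand-in for an S3 client error with an HTTP status."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def probe_missing_marker(has_list_bucket: bool) -> None:
    """Simulate HeadObject on a key that does not exist yet.

    S3 reports 404 only when the caller holds s3:ListBucket; otherwise
    it reports 403, so the caller cannot tell whether the key exists.
    """
    raise ProbeError(404 if has_list_bucket else 403)


def probe_existing_marker() -> None:
    """Simulate HeadObject on a marker that was already written."""
    return None


def decide(probe: Callable[[], None]) -> str:
    """Map a probe outcome to an alerting decision."""
    try:
        probe()
    except ProbeError as err:
        if err.status == 404:
            return "publish"  # no marker: first alert in the window
        return "publish-failsafe"  # probe errored: alert rather than go silent
    return "suppress"  # marker exists: still inside the dedup window


def simulate(firings: int, has_list_bucket: bool) -> int:
    """Count pages sent across consecutive cron firings of one outage."""
    marker_written = False
    pages = 0
    for _ in range(firings):
        if marker_written:
            outcome = decide(probe_existing_marker)
        else:
            outcome = decide(lambda: probe_missing_marker(has_list_bucket))
        if outcome == "publish":
            marker_written = True  # marker recorded after a clean probe
            pages += 1
        elif outcome == "publish-failsafe":
            pages += 1  # assumption: no marker is recorded on this path
    return pages


print(simulate(4, has_list_bucket=True))   # one page, then suppressed
print(simulate(4, has_list_bucket=False))  # every firing pages
```

Under these assumptions, four firings collapse to a single page once the role can distinguish "marker absent" from "access denied", and produce four pages when it cannot.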

Scoping

The grant is scoped by an `s3:prefix` Condition to `_alerts/_dedup` and `_alerts/_dedup/*`, so the watchdog cannot enumerate other prefixes on the bucket. This mirrors the alpha-engine-data #327 (L258) precedent, where `ArcticDBSmokeListBucket` added the same Condition-scoped grant for the OIDC role's ArcticDB smoke.
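For readers who have not seen a prefix-scoped ListBucket grant, the shape of such a statement can be sketched as follows. The Sid and the two prefixes come from this PR; the bucket name is a placeholder, and the `StringLike` operator is an assumption (it is the usual choice when a prefix value contains a wildcard), so the codified JSON in the repo remains the source of truth.

```python
import json

# Hypothetical placeholder: the real bucket name is not stated in this PR.
BUCKET = "example-bucket"


def dedup_list_bucket_statement(bucket: str) -> dict:
    """Build a ListBucket statement limited to the dedup-marker prefix."""
    return {
        "Sid": "DedupMarkerListBucket",
        "Effect": "Allow",
        "Action": "s3:ListBucket",
        # ListBucket is a bucket-level action, so the Resource is the
        # bucket ARN itself, not an object ARN ending in /*.
        "Resource": f"arn:aws:s3:::{bucket}",
        "Condition": {
            # Assumed operator; the prefixes are the ones named in this PR.
            "StringLike": {
                "s3:prefix": ["_alerts/_dedup", "_alerts/_dedup/*"]
            }
        },
    }


print(json.dumps(dedup_list_bucket_statement(BUCKET), indent=2))
```

Because the action applies to the bucket, the Condition is what keeps the grant narrow: a list or existence check outside the two prefixes is still denied.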

Live state

Already applied via `aws iam put-role-policy` against `alpha-engine-pipeline-watchdog-role` and verified live. `deploy.sh` re-applies the policy on every run, so the codified JSON stays in lockstep with AWS.

Test plan

  • IAM JSON validates as well-formed JSON
  • Live policy applied + verified (`DedupMarkerListBucket` Sid present with correct prefix Condition)
  • IAM drift check will pass on next run
  • First post-merge watchdog cron firing's CW logs should NOT show `dedup marker check errored`
  • Subsequent firings on a persistent simulated outage should dedup correctly (no re-pages within the 12h window)
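The first two checks in the list above could be automated with a small script along these lines. This is a sketch, not the repo's drift check: `check_policy` is a hypothetical helper, and it accepts any condition operator so that it does not depend on how the Condition is spelled in the codified JSON.

```python
import json
from typing import List

EXPECTED_PREFIXES = {"_alerts/_dedup", "_alerts/_dedup/*"}


def check_policy(policy_text: str) -> List[str]:
    """Return a list of problems; an empty list means both checks pass."""
    try:
        doc = json.loads(policy_text)
    except json.JSONDecodeError as err:
        return [f"not well-formed JSON: {err}"]
    if not isinstance(doc, dict):
        return ["top-level JSON value is not an object"]

    statements = doc.get("Statement", [])
    if isinstance(statements, dict):  # IAM allows a single statement object
        statements = [statements]
    matches = [s for s in statements if s.get("Sid") == "DedupMarkerListBucket"]
    if not matches:
        return ["DedupMarkerListBucket Sid missing"]

    problems = []
    stmt = matches[0]
    actions = stmt.get("Action", [])
    if isinstance(actions, str):
        actions = [actions]
    if "s3:ListBucket" not in actions:
        problems.append("statement does not grant s3:ListBucket")

    # Collect s3:prefix values under whichever operator the policy uses.
    prefixes = set()
    for keys in stmt.get("Condition", {}).values():
        value = keys.get("s3:prefix", [])
        prefixes.update([value] if isinstance(value, str) else value)
    if prefixes != EXPECTED_PREFIXES:
        problems.append(f"unexpected s3:prefix values: {sorted(prefixes)}")
    return problems
```

It can be pointed at either the codified policy file or the document returned by the live role, since both are plain IAM policy JSON.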

ROADMAP L295 (P2, 2026-05-26 PM audit finding).

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit 1a8aa8f into main May 27, 2026
1 check passed
@cipher813 cipher813 deleted the feat-l295-watchdog-listbucket-260527 branch May 27, 2026 16:45