Retry processing needs clearer observability so operators can tell when failures are stuck, recurring, or exhausting retries.
Acceptance criteria:
- Emit a notification or alert when retries fail repeatedly.
- Expose queue depth or backlog metrics.
- Make the failure path visible in logs and dashboards.
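
The criteria above could be covered by something like the following minimal sketch. It is an illustration under stated assumptions, not the existing implementation: `RetryMonitor`, its method names, the thresholds, and the `alert` callback are all hypothetical, and a real system would likely forward these values to its own metrics and alerting stack.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, Dict, Set

logger = logging.getLogger("retry")


@dataclass
class RetryMonitor:
    """Hypothetical helper that tracks retry failures and backlog size."""

    max_attempts: int = 5      # assumed retry budget per job
    alert_threshold: int = 3   # assumed: alert once after this many failures
    alert: Callable[[str], None] = print  # stand-in for a real notifier
    pending: Set[str] = field(default_factory=set)
    failures: Dict[str, int] = field(default_factory=dict)

    def enqueue(self, job_id: str) -> None:
        """Add a job to the backlog."""
        self.pending.add(job_id)

    def queue_depth(self) -> int:
        """Backlog metric: number of jobs still waiting or retrying."""
        return len(self.pending)

    def record_failure(self, job_id: str, error: Exception) -> str:
        """Log the failure, alert if needed, and return the job's state."""
        count = self.failures.get(job_id, 0) + 1
        self.failures[job_id] = count
        # Structured-ish log line so the failure path is searchable.
        logger.warning(
            "retry failed job=%s attempt=%d/%d error=%r",
            job_id, count, self.max_attempts, error,
        )
        if count >= self.max_attempts:
            self.alert(f"job {job_id} exhausted {self.max_attempts} retries")
            # Assumption: exhausted jobs leave the backlog.
            self.pending.discard(job_id)
            return "exhausted"
        if count == self.alert_threshold:
            self.alert(f"job {job_id} has failed {count} times")
            return "alerted"
        return "retrying"

    def record_success(self, job_id: str) -> None:
        """Clear failure state and remove the job from the backlog."""
        self.failures.pop(job_id, None)
        self.pending.discard(job_id)
```

A worker would call `record_failure` from its exception handler and `record_success` on completion, while a periodic task could publish `queue_depth()` as a gauge for dashboards. Alerting once at the threshold and again on exhaustion keeps notifications from repeating on every attempt.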