Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
I think we should rename One thing to consider here is what we'll do once we have alerts based on failure rate and the associated event type.
Yes, your proposition makes more sense
No strong opinion, but the current payload does lack |
yes, let's do
that's just the computed value of also do you know |
|
I would represent the threshold itself which may not line up with the current / max exactly. |
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
TDD setup - tests will pass once feature is implemented. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Send alert when destination is auto-disabled after reaching consecutive failure threshold. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…bject Replace flat fields (ConsecutiveFailures, MaxConsecutiveFailures, Progress, WillDisable) with a scoped ConsecutiveFailures struct containing Current, Max, and Threshold. This produces a cleaner JSON payload structure and removes the redundant WillDisable field (threshold == 100 implies disable). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace ConsecutiveFailures struct in DestinationDisabledData with a Reason string field. This decouples the disabled alert from the specific trigger mechanism, allowing future disable reasons (e.g., error rate) without restructuring the payload. Also update e2e types and assertions for the nested ConsecutiveFailures struct introduced in the previous commit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d 100 After disabling a destination, update the consecutive failure alert's destination to include DisabledAt, so consumers see the post-disable state. Add test asserting this behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use generic "failed to send alert" / "alert sent" messages with a topic field instead of alert-type-specific message strings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@alexbouchardd updated, the PR description is up-to-date if you want to review the schema, etc. |
Changes
Alert Payload Schema
consecutive_failuresinto a nested object withcurrent,max,thresholdwill_disablefield (threshold == 100 implies disable)MarshalJSONmethods in favor of standardjson.Marshaltenant_idto top-level of alert dataAlertDestinationwithfilter,metadata,updated_atDeliveryAttemptstruct: passEvent,Destination,AttemptdirectlyNew
alert.destination.disabledCallbackDisabledAtis set on returned destinationreasonfield (e.g.,"consecutive_failure") instead of embedding trigger-specific data, keeping the payload extensible for future disable mechanisms (e.g., error rate)alert.destination.disabledandalert.destination.consecutive_failure(withthreshold: 100) are sent. The disabled alert is sent first, but ordering is not guaranteed.Consecutive Failure Alert at Threshold 100
disabled_at)Error Handling
DestinationDisablerreturns disabled destination for timestamp consistencyNotification Delivery
Alert notifications are sent synchronously via HTTP within
HandleAttempt, which itself is called asynchronously from the delivery pipeline (go h.handleAlertAttempt(...)). This means notifications don't block event delivery, but a slow notification (e.g., the disabled alert) will delay subsequent notifications (e.g., the consecutive failure alert) within the same handler call.Future consideration: If notification latency becomes an issue, we could decouple the two notifications — either by sending them concurrently or by introducing a notification queue. For now, the synchronous approach keeps the implementation simple and the best-effort error handling makes the current behavior acceptable.
Alert Payloads
eventis a fullmodels.Eventandattemptis a fullmodels.Attempt.alert.destination.consecutive_failure{ "topic": "alert.destination.consecutive_failure", "timestamp": "2025-01-15T10:30:00Z", "data": { "tenant_id": "tenant_123", "event": { ... }, "attempt": { ... }, "consecutive_failures": { "current": 10, "max": 20, "threshold": 50 }, "destination": { "id": "dest_xyz", "tenant_id": "tenant_123", "type": "webhook", "topics": ["*"], "filter": {}, "config": {}, "metadata": {}, "created_at": "2025-01-01T00:00:00Z", "updated_at": "2025-01-01T00:00:00Z", "disabled_at": null } } }alert.destination.disabled{ "topic": "alert.destination.disabled", "timestamp": "2025-01-15T10:30:00Z", "data": { "tenant_id": "tenant_123", "disabled_at": "2025-01-15T10:30:00Z", "reason": "consecutive_failure", "event": { ... }, "attempt": { ... }, "destination": { "id": "dest_xyz", "tenant_id": "tenant_123", "type": "webhook", "topics": ["*"], "filter": {}, "config": {}, "metadata": {}, "created_at": "2025-01-01T00:00:00Z", "updated_at": "2025-01-15T10:30:00Z", "disabled_at": "2025-01-15T10:30:00Z" } } }