Let agent signal step failure via a tagged block #2045
Changes from all commits
83fd5b7
863e7cf
e9f1247
9c37036
44f53ff
a3d4075
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| --- | ||
| name: octopus-fail-deployment | ||
| description: Use when the user's prompt asks for the step to FAIL under some condition — e.g. "fail the deployment if the health check is red", "if X happens, fail this step", "the runbook should fail when Y". By default an agent run always succeeds (the process exits 0), so the ONLY way to make Octopus mark this step as failed is to emit the sentinel described here. Do NOT use when the user has not expressed any failure condition — absence of the sentinel means success. | ||
| --- | ||
| By default this step **succeeds** — when your run finishes normally, Octopus marks the step green regardless of what you found. | ||
|
|
||
| If the user's prompt states a condition under which the step should **fail** (for example "fail the deployment if the smoke test doesn't pass"), and you determine that condition has been met, you must explicitly signal the failure. The only way Octopus can detect this from the outside is a specific tagged block in your final response. | ||
|
|
||
| ## How to signal failure | ||
|
|
||
| Emit this block as part of your **final** message, with the reason between the tags: | ||
|
|
||
| <octopus-task-failed> | ||
| A short reason describing why the step failed. | ||
| </octopus-task-failed> | ||
|
|
||
| For example: | ||
|
|
||
| <octopus-task-failed> | ||
| Smoke test returned HTTP 500 from /health after 3 retries. | ||
| </octopus-task-failed> | ||
|
|
||
| ## Rules | ||
|
|
||
| - Emit the block **only** when the user expressed a failure condition AND you have determined it is met. If the condition was not met, say nothing special and let the step succeed. | ||
| - Always write a **complete** block — either a paired block ending in `</octopus-task-failed>` or a self-closing `<octopus-task-failed/>`. A closed tag is how Octopus confirms the message is whole — if you open the block but stop before closing it, the failure will not be detected, so finish the block before ending your turn. | ||
|
Contributor
Re: the self-closing block. I would be tempted to remove that as an option. It makes the whole thing more consistent and the matching regex simpler. That said, I'm a bit concerned about removing flexibility that Claude might like to take advantage of.
Contributor
Author
Interesting you say that. I originally had it not allow self-closing tags, but then Claude pointed out that allowing them might make it more likely to use the block when there happens to be no specific reason to give.
Contributor
That's reason enough if Claude thinks it might be a problem without it. |
||
| - Put the tags on their **own lines**, as plain text. Do **not** wrap them in backticks, code fences, bold, or any other markdown. | ||
| - Emit the block **once**. One block is enough to fail the step. | ||
| - Keep the reason **concise and specific** — it is surfaced in the Octopus task log as the failure message, so write it for the operator who will read it. The reason may span multiple lines. | ||
| - The reason is optional but strongly encouraged; an empty block — or a self-closing `<octopus-task-failed/>` — will still fail the step with a generic message. | ||
| - If you cannot determine whether the condition was met, do not guess silently — explain what you found. Only emit the block if the user's intent was that an unverifiable outcome should fail the step. | ||
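The review thread above mentions a "matching regex", but the PR page does not show it. As a rough illustration of how the rules might be enforced on the receiving side, here is a minimal C# sketch. The class name `FailureSentinel`, the method `TryGetFailure`, and the generic fallback message are all hypothetical, and the pattern is an assumption rather than the one Octopus actually uses. It accepts a paired block or the self-closing form, treats an unclosed block as "no failure", and does not try to enforce the "own lines, no markdown" rule.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: not taken from the PR, only a sketch of the rules above.
public static class FailureSentinel
{
    // Assumed fallback text; the real generic message is not given in the PR.
    public const string GenericReason = "The agent signalled that the step failed.";

    // Either the self-closing form or a paired block.
    // Singleline lets '.' match newlines so the reason may span several lines;
    // the lazy quantifier stops at the first closing tag.
    private static readonly Regex Pattern = new Regex(
        @"<octopus-task-failed\s*/>|<octopus-task-failed>(?<reason>.*?)</octopus-task-failed>",
        RegexOptions.Singleline);

    // Returns true, with a reason, only when a complete block is present.
    public static bool TryGetFailure(string finalMessage, out string reason)
    {
        reason = string.Empty;
        var match = Pattern.Match(finalMessage ?? string.Empty);
        if (!match.Success)
        {
            // No block, or a block that was opened but never closed: the step succeeds.
            return false;
        }

        // For the self-closing form the "reason" group is empty.
        var text = match.Groups["reason"].Value.Trim();
        reason = text.Length > 0 ? text : GenericReason;
        return true;
    }
}
```

Under these assumptions, a caller would pass the agent's final message to `FailureSentinel.TryGetFailure` and fail the step with the returned reason when it yields `true`. Supporting the self-closing form costs one extra alternative in the pattern, which is the trade-off discussed in the thread.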
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -7,6 +7,7 @@ | |
| using Calamari.Common.Features.Behaviours; | ||
| using Calamari.Common.Plumbing.Extensions; | ||
| using Calamari.Common.Plumbing.FileSystem; | ||
| + using Calamari.Common.Plumbing.Logging; | ||
| using Calamari.Common.Plumbing.Variables; | ||
|
|
||
| namespace Calamari.Common.Plumbing.Pipeline | ||
|
|
@@ -51,6 +52,8 @@ protected virtual IEnumerable<IOnFinishBehaviour> OnFinish(OnFinishResolver reso | |
| public async Task Execute(ILifetimeScope lifetimeScope, IVariables variables) | ||
| { | ||
| var pathToPrimaryPackage = variables.GetPathToPrimaryPackage(lifetimeScope.Resolve<ICalamariFileSystem>(), false); | ||
| + var log = lifetimeScope.Resolve<ILog>(); | ||
|
|
||
| var deployment = new RunningDeployment(pathToPrimaryPackage, variables); | ||
|
|
||
| try | ||
|
|
@@ -67,7 +70,7 @@ public async Task Execute(ILifetimeScope lifetimeScope, IVariables variables) | |
| } | ||
| catch (Exception installException) | ||
| { | ||
| - Console.Error.WriteLine("Running rollback behaviours..."); | ||
| + log.Verbose("Running rollback behaviours..."); | ||
|
Contributor
Author
This was annoying as hell. There is no rollback behaviour taking place, so why make it so prominent? |
||
|
|
||
|
zentron marked this conversation as resolved.
|
||
| deployment.Error(installException); | ||
|
|
||
|
|
@@ -78,7 +81,7 @@ public async Task Execute(ILifetimeScope lifetimeScope, IVariables variables) | |
| } | ||
| catch (Exception rollbackException) | ||
| { | ||
| - Console.Error.WriteLine(rollbackException); | ||
|
Contributor
Author
We should generally not be writing to Console in Calamari, as it makes testing more difficult. |
||
| + log.Error(rollbackException.PrettyPrint()); | ||
| } | ||
|
zentron marked this conversation as resolved.
|
||
|
|
||
| throw; | ||
|
|
||
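To illustrate the review comment about avoiding `Console` for the sake of testing, the following is a small sketch rather than Calamari's actual code. `IMiniLog` is a hypothetical, trimmed-down stand-in for the real `ILog` interface (which has more members), and `InMemoryLog` and `RollbackRunner` are likewise invented for the example. The sketch also logs `Message` where the diff uses `PrettyPrint()`. It mirrors the shape of the catch blocks in the diff: log at verbose level, attempt the rollback, log any rollback error, then rethrow.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, minimal stand-in for Calamari's ILog.
public interface IMiniLog
{
    void Verbose(string message);
    void Error(string message);
}

// Test double that records messages instead of writing to the console.
public sealed class InMemoryLog : IMiniLog
{
    public List<string> Messages { get; } = new List<string>();

    public void Verbose(string message) => Messages.Add("verbose: " + message);

    public void Error(string message) => Messages.Add("error: " + message);
}

public static class RollbackRunner
{
    // Runs install; on failure logs, attempts rollback, and rethrows the install exception.
    public static void Run(Action install, Action rollback, IMiniLog log)
    {
        try
        {
            install();
        }
        catch (Exception)
        {
            log.Verbose("Running rollback behaviours...");
            try
            {
                rollback();
            }
            catch (Exception rollbackException)
            {
                // A failing rollback is logged but does not replace the original error.
                log.Error(rollbackException.Message);
            }

            throw; // rethrows the exception caught by the outer catch
        }
    }
}
```

Because the messages go through an injected log, a test can pass an `InMemoryLog` and assert on `Messages` directly, with no need to redirect `Console.Error`.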