Handle pending helm updates and installs #2024

Open
bec-callow-oct wants to merge 4 commits into main from bec/hpy-1416-handle-pending-helm

bec-callow-oct (Contributor) commented Jun 18, 2026

⚠️ Does this change require a corresponding Server Change?
⚠️ If so - please add a "Requires Server Change" label to this PR!

Background

Fixes: OctopusDeploy/Issues#10081

When a Helm deployment is cancelled partway through the Helm release and then deployed again, the new deployment sometimes fails with Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress. A terminated Helm deployment leaves its release in pending-upgrade or pending-install status, and Helm requires that status to be cleared via uninstall or rollback before it will run another operation on the release.

This PR adds handling to the beginning of helm deployments to detect and recover from stuck releases. I chose not to include the cleanup at the end of a deployment because it might prevent users from being able to investigate failures.
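
As a rough illustration of the recovery decision described above, here is a minimal sketch rather than the code in this PR. The names `RecoveryAction` and `PendingReleaseRecovery` are hypothetical; only the two pending statuses and their recovery steps come from the description.

```csharp
using System;

// Hypothetical type for illustration only; not part of the PR.
public enum RecoveryAction
{
    None,
    Uninstall,
    Rollback
}

// Hypothetical helper for illustration only; not part of the PR.
public static class PendingReleaseRecovery
{
    // Maps the status Helm reports for a release to the recovery step
    // described in the PR background. Any other status (including a
    // missing one) is left alone in this sketch.
    public static RecoveryAction Decide(string status)
    {
        switch (status?.Trim().ToLowerInvariant())
        {
            case "pending-install":
                // Stuck first install: there is no earlier revision to return to.
                return RecoveryAction.Uninstall;
            case "pending-upgrade":
                // Stuck upgrade: an earlier revision exists, so roll back to it.
                return RecoveryAction.Rollback;
            default:
                return RecoveryAction.None;
        }
    }
}
```

Keeping the decision separate from the Helm CLI calls makes it easy to unit test without a cluster, which is the main reason for sketching it this way.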

Results

Before: [screenshot]

After: [two screenshots]

bec-callow-oct force-pushed the bec/hpy-1416-handle-pending-helm branch 2 times, most recently from 53c5c6f to 480fd77 (June 29, 2026 06:11)
bec-callow-oct force-pushed the bec/hpy-1416-handle-pending-helm branch from 480fd77 to f011729 (June 30, 2026 05:04)
log.Warn($"Release {releaseName} is stuck in {status} state, likely from a cancelled first install. Uninstalling to recover...");
var uninstallResult = helmCli.Uninstall(releaseName);
if (uninstallResult.ExitCode != 0)
    log.Warn($"Uninstall returned non-zero exit code {uninstallResult.ExitCode}. Continuing with upgrade...");

bec-callow-oct (Contributor, Author) commented:

If the uninstall fails, the upgrade is likely to fail too, but we continue just in case and let the deployment fail as normal.

log.Warn($"Release {releaseName} is stuck in {status} state, likely from a cancelled deployment. Rolling back to recover...");
var rollbackResult = helmCli.Rollback(releaseName);
if (rollbackResult.ExitCode != 0)
    log.Warn($"Rollback returned non-zero exit code {rollbackResult.ExitCode}. Continuing with upgrade...");

bec-callow-oct (Contributor, Author) commented:

As above: if the rollback fails, the deployment is likely to fail after this.

executor.ExecuteHelmUpgrade(deployment, releaseName, newRevisionNumber, new CancellationTokenSource(), new CancellationTokenSource());
return;
}
var expectedRevisionNumber = (currentMetadata?.Revision ?? 0) + 1;

bec-callow-oct (Contributor, Author) commented:

Following an uninstall, Helm starts again at revision 1 for the next release. After a rollback, the next revision number is assigned as normal.
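
To make the two cases concrete, here is a small sketch of the arithmetic in the `expectedRevisionNumber` line above. The class and method names are hypothetical, and it assumes the release metadata is read again after recovery, so that an uninstalled release has no metadata at all.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the PR.
public static class RevisionExpectation
{
    // revisionAfterRecovery is the revision Helm reports once recovery has run,
    // or null when the release no longer exists (for example after an uninstall).
    public static int NextRevision(int? revisionAfterRecovery)
    {
        // Same expression as in the diff: a missing release counts as revision 0.
        return (revisionAfterRecovery ?? 0) + 1;
    }
}
```

With this sketch, `RevisionExpectation.NextRevision(null)` returns 1 (the uninstall case), while `RevisionExpectation.NextRevision(3)` returns 4 (the rollback case, where Helm still reports an existing revision).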

bec-callow-oct force-pushed the bec/hpy-1416-handle-pending-helm branch 2 times, most recently from 6548d21 to 12deb05 (June 30, 2026 06:36)
bec-callow-oct requested a review from Copilot (June 30, 2026 06:36)
bec-callow-oct marked this pull request as ready for review (June 30, 2026 06:37)

Copilot AI left a comment

Pull request overview

Adds pre-upgrade recovery logic for Helm releases that are left in pending-install / pending-upgrade after a cancelled deployment, so subsequent deployments don’t fail with “another operation … is in progress”.

Changes:

  • Extend Helm metadata retrieval to include both revision and status.
  • Add recovery actions before upgrade: uninstall for pending-install, rollback for pending-upgrade.
  • Add unit tests covering the pending-state recovery behavior.
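
For readers unfamiliar with the first change, the sketch below shows one way to pull both values out of Helm's metadata output. It is not the parsing in HelmCli.cs, which is not shown here. It assumes the JSON carries top-level `revision` and `status` fields, uses System.Text.Json purely for illustration, and all type names are hypothetical.

```csharp
using System;
using System.Text.Json;

// Hypothetical type for illustration only; not part of the PR.
public sealed class ReleaseMetadata
{
    public int Revision { get; }
    public string Status { get; }

    public ReleaseMetadata(int revision, string status)
    {
        Revision = revision;
        Status = status;
    }
}

// Hypothetical helper for illustration only; not part of the PR.
public static class ReleaseMetadataParser
{
    // Assumption: the input is a JSON object with a numeric "revision"
    // field and a string "status" field at the top level.
    public static ReleaseMetadata Parse(string json)
    {
        using var document = JsonDocument.Parse(json);
        var root = document.RootElement;

        // Fall back to revision 0 when the field is missing or not a number.
        var revision = root.TryGetProperty("revision", out var revisionElement)
                       && revisionElement.ValueKind == JsonValueKind.Number
            ? revisionElement.GetInt32()
            : 0;

        // Fall back to "unknown" when the field is missing or not a string.
        var status = root.TryGetProperty("status", out var statusElement)
                     && statusElement.ValueKind == JsonValueKind.String
            ? statusElement.GetString()
            : null;

        return new ReleaseMetadata(revision, status ?? "unknown");
    }
}
```

Falling back to defaults rather than throwing is a design choice for this sketch only; real code may prefer to surface malformed output as an error.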

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| source/Calamari/Kubernetes/Integration/HelmCli.cs | Adds get metadata status parsing plus rollback/uninstall helpers; adjusts upgrade argument handling. |
| source/Calamari/Kubernetes/Conventions/HelmUpgradeWithKOSConvention.cs | Calls recovery logic before starting the parallel helm upgrade + manifest/status monitoring tasks. |
| source/Calamari/Kubernetes/Conventions/Helm/HelmUpgradeExecutor.cs | Introduces RecoverFromPendingRelease to clear stuck pending releases before upgrade. |
| source/Calamari.Tests/KubernetesFixtures/Conventions/Helm/HelmUpgradeWithKOSConventionTests.cs | New tests for rollback/uninstall behavior when metadata reports pending states. |

Comment thread source/Calamari/Kubernetes/Conventions/Helm/HelmUpgradeExecutor.cs Outdated
Comment thread source/Calamari/Kubernetes/Conventions/Helm/HelmUpgradeExecutor.cs
Comment thread source/Calamari/Kubernetes/Integration/HelmCli.cs
bec-callow-oct force-pushed the bec/hpy-1416-handle-pending-helm branch from 12deb05 to 18c021e (June 30, 2026 07:04)