Handle pending helm updates and installs #2024

    log.Warn($"Release {releaseName} is stuck in {status} state, likely from a cancelled first install. Uninstalling to recover...");
    var uninstallResult = helmCli.Uninstall(releaseName);
    if (uninstallResult.ExitCode != 0)
        log.Warn($"Uninstall returned non-zero exit code {uninstallResult.ExitCode}. Continuing with upgrade...");

The upgrade is likely to fail at this point, but we continue just in case and let the deployment fail as normal.

    log.Warn($"Release {releaseName} is stuck in {status} state, likely from a cancelled deployment. Rolling back to recover...");
    var rollbackResult = helmCli.Rollback(releaseName);
    if (rollbackResult.ExitCode != 0)
        log.Warn($"Rollback returned non-zero exit code {rollbackResult.ExitCode}. Continuing with upgrade...");

As above, the deployment is likely to fail after this.

        executor.ExecuteHelmUpgrade(deployment, releaseName, newRevisionNumber, new CancellationTokenSource(), new CancellationTokenSource());
        return;
    }
    var expectedRevisionNumber = (currentMetadata?.Revision ?? 0) + 1;

Following an uninstall, Helm starts again at revision 1 for the next release. After a rollback, the next revision number is assigned as normal.
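
To make the revision arithmetic concrete, here is a minimal sketch of the `(currentMetadata?.Revision ?? 0) + 1` expression from the diff. `ExpectedRevision` is a hypothetical helper written for illustration, and treating a null revision as "no release metadata available" (for example after an uninstall) is an assumption; the snippet does not show how or when the PR re-reads metadata after recovery.

```csharp
using System;

// Hypothetical helper mirroring `(currentMetadata?.Revision ?? 0) + 1` from the diff.
// Assumption: a null revision means no release metadata exists, e.g. after an uninstall.
static int ExpectedRevision(int? currentRevision) => (currentRevision ?? 0) + 1;

Console.WriteLine(ExpectedRevision(null)); // no existing release: prints 1
Console.WriteLine(ExpectedRevision(3));    // existing release at revision 3: prints 4
```

With no metadata the null-coalescing fallback yields revision 1, which matches the uninstall case described in the comment above.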
Pull request overview
Adds pre-upgrade recovery logic for Helm releases that are left in pending-install / pending-upgrade after a cancelled deployment, so subsequent deployments don’t fail with “another operation … is in progress”.
Changes:
- Extend Helm metadata retrieval to include both revision and status.
- Add recovery actions before upgrade: uninstall for `pending-install`, rollback for `pending-upgrade`.
- Add unit tests covering the pending-state recovery behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `source/Calamari/Kubernetes/Integration/HelmCli.cs` | Adds `get metadata` status parsing plus rollback/uninstall helpers; adjusts upgrade argument handling. |
| `source/Calamari/Kubernetes/Conventions/HelmUpgradeWithKOSConvention.cs` | Calls recovery logic before starting the parallel helm upgrade + manifest/status monitoring tasks. |
| `source/Calamari/Kubernetes/Conventions/Helm/HelmUpgradeExecutor.cs` | Introduces `RecoverFromPendingRelease` to clear stuck pending releases before upgrade. |
| `source/Calamari.Tests/KubernetesFixtures/Conventions/Helm/HelmUpgradeWithKOSConventionTests.cs` | New tests for rollback/uninstall behavior when metadata reports pending states. |
Background
Fixes: OctopusDeploy/Issues#10081
When a Helm deployment is cancelled partway through the Helm release and then deployed again, the second deployment sometimes fails with:

`Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress`

When Helm deployments are terminated, the release is left in `pending-upgrade` or `pending-install` status. Helm requires these to be cleared via uninstall or rollback.

This PR adds handling at the beginning of Helm deployments to detect and recover from stuck releases. I chose not to include the cleanup at the end of a deployment because it might prevent users from investigating failures.
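
The mapping from stuck status to recovery step can be sketched roughly as follows. This is an illustration only, not the PR's `RecoverFromPendingRelease`: `ChooseRecoveryAction` and the action names it returns are hypothetical, and only the two statuses discussed in this PR are handled.

```csharp
using System;

// Hypothetical helper for illustration: maps a Helm release status string
// to the recovery step described in this PR.
static string ChooseRecoveryAction(string status) => status switch
{
    "pending-install" => "uninstall", // cancelled first install: remove the release
    "pending-upgrade" => "rollback",  // cancelled upgrade: roll back the release
    _ => "none",                      // any other status, or no release: upgrade as usual
};

foreach (var status in new[] { "pending-install", "pending-upgrade", "deployed" })
    Console.WriteLine($"{status} -> {ChooseRecoveryAction(status)}");
```

Running the check before the upgrade rather than after a failed deployment keeps the failed release intact for investigation, which is the trade-off described above.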
Results
Before
After