OCPBUGS-77924: cleanup orphaned boot entries in IPC #6054

Open
danmanor wants to merge 1 commit into openshift-kni:main from danmanor:cleanup-extra-boot-entry

Conversation

@danmanor (Member) commented Apr 21, 2026

Summary

  • Add removeOrphanedBootEntries() to the IPC idle handler cleanup flow, which removes /boot/ostree/ directories that don't match any stateroot listed in
    rpm-ostree deployments
  • This prevents storage exhaustion on /boot caused by stale boot entries (e.g. from the seed image or IBI) that have no corresponding deployed stateroot
  • Fix pre-existing test failures in ipc_idle_handlers_test.go by overriding the osReadDir package variable for tests that exercise
    CleanupUnbootedStateroots
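
The osReadDir override in the last bullet follows a common Go testing pattern: the filesystem call goes through a package-level function variable, so a test can replace it with a fake. The following is a minimal sketch of that pattern, not the repository's code. Only the osReadDir name comes from this PR; listBootEntries, withFakeReadDir, and demoBootFS are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"testing/fstest"
)

// osReadDir is a package-level seam: production code reads directories
// through it, and tests may replace it.
var osReadDir = os.ReadDir

// demoBootFS is hypothetical sample data: an in-memory tree with two boot entries.
var demoBootFS = fstest.MapFS{
	"boot/ostree/rhcos-aaa/vmlinuz": &fstest.MapFile{},
	"boot/ostree/rhcos-bbb/vmlinuz": &fstest.MapFile{},
}

// listBootEntries (hypothetical) returns the names of the directories under dir.
func listBootEntries(dir string) ([]string, error) {
	entries, err := osReadDir(dir)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, e := range entries {
		if e.IsDir() {
			names = append(names, e.Name())
		}
	}
	return names, nil
}

// withFakeReadDir (hypothetical) points osReadDir at an in-memory filesystem
// and returns a function that restores the original implementation.
func withFakeReadDir(fsys fs.ReadDirFS) (restore func()) {
	orig := osReadDir
	osReadDir = func(name string) ([]os.DirEntry, error) {
		return fsys.ReadDir(name)
	}
	return func() { osReadDir = orig }
}

func main() {
	restore := withFakeReadDir(demoBootFS)
	defer restore()

	names, err := listBootEntries("boot/ostree")
	fmt.Println(names, err)
}
```

In a real test the restore function would typically be registered with t.Cleanup, so that one subtest's override cannot leak into the next.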

Details

On RHCOS-formatted disks there is room for only 2 boot entries at a time. We observed cases where an old boot entry from the seed image (dated 2022)
persisted in /boot/ostree/ with no matching stateroot in the output of ostree admin status. This left no space for the IP Configuration flow to create its
target stateroot, causing failures in the pre-pivot phase.

The existing removeBootDirsByStaterootPrefixes() only cleans up boot entries for stateroots that appear as unbooted deployments in rpm-ostree. It does
not handle entries whose stateroot was already fully removed from all deployments (i.e. truly orphaned entries). The new removeOrphanedBootEntries()
function complements it by querying all deployed stateroots and removing any /boot/ostree/ directory that doesn't match any of them.
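
The orphan check described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: it assumes boot directories are named "<stateroot>-<checksum>", takes the deployed stateroots as a plain slice instead of querying rpm-ostree, and uses the hypothetical names isOrphaned and removeOrphans.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// isOrphaned reports whether a boot directory name belongs to none of the
// deployed stateroots. Assumption: names look like "<stateroot>-<checksum>".
func isOrphaned(dirName string, deployedStateroots []string) bool {
	for _, sr := range deployedStateroots {
		if strings.HasPrefix(dirName, sr+"-") {
			return false
		}
	}
	return true
}

// removeOrphans deletes every orphaned directory under bootOstreeDir and
// returns the names it removed. A missing directory means nothing to clean up.
func removeOrphans(bootOstreeDir string, deployedStateroots []string) ([]string, error) {
	entries, err := os.ReadDir(bootOstreeDir)
	if errors.Is(err, fs.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, fmt.Errorf("reading %s: %w", bootOstreeDir, err)
	}
	var removed []string
	for _, e := range entries {
		if !e.IsDir() || !isOrphaned(e.Name(), deployedStateroots) {
			continue
		}
		if err := os.RemoveAll(filepath.Join(bootOstreeDir, e.Name())); err != nil {
			return removed, fmt.Errorf("removing %s: %w", e.Name(), err)
		}
		removed = append(removed, e.Name())
	}
	return removed, nil
}

func main() {
	// Work in a temporary directory so the sketch never touches a real /boot.
	root, err := os.MkdirTemp("", "boot-ostree-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	for _, name := range []string{"rhcos-1111", "rhcos_4.18-2222", "seed-2022-3333"} {
		if err := os.Mkdir(filepath.Join(root, name), 0o755); err != nil {
			panic(err)
		}
	}

	removed, err := removeOrphans(root, []string{"rhcos", "rhcos_4.18"})
	fmt.Println(removed, err)
}
```

Prefix matching errs on the side of keeping a directory: with a deployed stateroot "rhcos", a directory for a hypothetical stateroot "rhcos-foo" would also be kept. For a cleanup that deletes from /boot, that conservative bias seems preferable to the alternative.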

Test plan

  • New unit tests for removeOrphanedBootEntries: basic removal, missing /boot/ostree/ dir, multiple deployments
  • Updated mock expectations in existing integration tests for additional QueryStatus/ReadDir calls
  • All TestIPCIdleStageHandler_Handle subtests pass
  • Linter passes (make golangci-lint)

@openshift-ci-robot

@danmanor: This pull request references Jira Issue OCPBUGS-77924, which is invalid:

  • expected the bug to target the "5.0.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@danmanor danmanor changed the title from "OCPBUGS-77924: cleanup orphaned boot entries in IPC" to "WIP: OCPBUGS-77924: cleanup orphaned boot entries in IPC" Apr 21, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress label Apr 21, 2026
@danmanor danmanor force-pushed the cleanup-extra-boot-entry branch from de372b7 to 3ff99d4 April 21, 2026 09:57
@danmanor danmanor changed the title from "WIP: OCPBUGS-77924: cleanup orphaned boot entries in IPC" to "OCPBUGS-77924: cleanup orphaned boot entries in IPC" Apr 23, 2026
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress label Apr 23, 2026
@danmanor
Member Author

/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold label Apr 23, 2026
@openshift-ci openshift-ci Bot added the lgtm label May 1, 2026
@openshift-ci
Contributor

openshift-ci Bot commented May 1, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jc-rh

@openshift-ci openshift-ci Bot added the approved label May 1, 2026

Labels

approved, do-not-merge/hold, lgtm

4 participants