Resize pods in place on resource-only template changes #50

Open: tamalsaha wants to merge 2 commits into master from inplace-vertical-scaling



@tamalsaha commented Jun 29, 2026

Contributor

What

Teaches the PetSet controller to actuate a resource-only template change in place via the Kubernetes pods/resize subresource instead of deleting and recreating the pod. This is generic infrastructure: every KubeDB database that uses PetSet inherits it.

  • onlyResourcesDiffer renders the pod's current revision and the update revision (via ApplyRevision), zeroes resources on both, and deep-compares the whole PodSpec. Comparing two rendered revisions (not the live pod vs a template) cancels apiserver defaulting noise; comparing the full PodSpec (not just container slices) guarantees any non-resource change (volumes, affinity, tolerations, nodeSelector, securityContext, labels, ...) is ineligible and is never silently dropped.
  • ResizeStatefulPod issues UpdateResize, waits for kubelet actuation (the PodResizePending/PodResizeInProgress conditions and ContainerStatuses[].Resources), then patches the pod's controller-revision-hash label to the update revision so revision accounting converges without a restart.
  • Wired into both rolling-update sites (updatePetSet and updatePetSetAfterInvariantEstablished). In the MaxUnavailable path the in-place resize consumes the maxUnavailable budget, so a potentially disruptive resize (a RestartContainer resizePolicy or a readiness blip) cannot hit every stale pod at once.
  • Falls back to delete-and-recreate when the resize is unsupported (isResizeUnsupported: pods/resize is absent, or the cluster's InPlacePodVerticalScaling gate is off) or is reported Infeasible. Transient errors requeue and never delete the pod.
  • Behind the InPlaceVerticalScaling feature gate (Beta, default on). With the gate off, behavior is identical to the current delete-and-recreate path.
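
The eligibility check described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `PodSpec`, `Container`, and `Resources` are simplified, hypothetical stand-ins for the `k8s.io/api/core/v1` types, both inputs are assumed to be already rendered from their revisions, and treating two identical specs as "not eligible" is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"reflect"
)

// Hypothetical, simplified stand-ins for the corev1 types.
type Resources struct {
	Requests map[string]string
	Limits   map[string]string
}

type Container struct {
	Name      string
	Image     string
	Resources Resources
}

type PodSpec struct {
	Containers   []Container
	NodeSelector map[string]string
	Volumes      []string
	Resources    *Resources // pod-level resources
}

// stripResources returns a copy of spec with container and pod-level
// resources zeroed; every other field is carried over unchanged.
func stripResources(spec PodSpec) PodSpec {
	out := PodSpec{NodeSelector: spec.NodeSelector, Volumes: spec.Volumes}
	out.Containers = make([]Container, len(spec.Containers))
	for i, c := range spec.Containers {
		c.Resources = Resources{} // c is a copy, the input is not mutated
		out.Containers[i] = c
	}
	return out
}

// onlyResourcesDiffer reports whether the two rendered specs differ, and
// differ only in resources. Any other change makes the pod ineligible.
func onlyResourcesDiffer(current, update PodSpec) bool {
	if reflect.DeepEqual(current, update) {
		return false // nothing to resize (assumption of this sketch)
	}
	return reflect.DeepEqual(stripResources(current), stripResources(update))
}

func main() {
	cur := PodSpec{Containers: []Container{{Name: "db", Image: "pg:16",
		Resources: Resources{Requests: map[string]string{"cpu": "500m"}}}}}
	upd := PodSpec{Containers: []Container{{Name: "db", Image: "pg:16",
		Resources: Resources{Requests: map[string]string{"cpu": "1"}}}}}
	fmt.Println("resources only:", onlyResourcesDiffer(cur, upd))

	upd.NodeSelector = map[string]string{"disk": "ssd"}
	fmt.Println("resources + nodeSelector:", onlyResourcesDiffer(cur, upd))
}
```

Comparing the whole stripped spec, rather than an allow-list of fields, means a newly added PodSpec field is treated as a non-resource change by default.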

Tests

Covers the onlyResourcesDiffer truth table, including non-container changes (nodeSelector, volume) being ineligible even alongside a resource change; resizeState transitions; the isResizeUnsupported matrix; and gate on/off. The full controller suite is green.
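
As a rough illustration of the state and fallback matrix these tests exercise, the sketch below maps pod conditions to a resize state and then to a controller action. The `condition` type, the state names, and the `nextAction` helper are hypothetical and are not taken from the PR. Only the `PodResizePending` and `PodResizeInProgress` condition types and the `Infeasible` reason come from Kubernetes. The sketch also ignores `ContainerStatuses[].Resources`, which the real wait logic consults.

```go
package main

import "fmt"

// condition is a hypothetical, trimmed-down pod condition.
type condition struct {
	Type   string
	Reason string
}

type resizeState string

const (
	statePending    resizeState = "Pending"
	stateInProgress resizeState = "InProgress"
	stateInfeasible resizeState = "Infeasible"
	stateDone       resizeState = "Done"
)

// classify derives a resize state from the pod's conditions.
// No resize condition at all is treated as "actuated" in this sketch.
func classify(conds []condition) resizeState {
	state := stateDone
	for _, c := range conds {
		switch c.Type {
		case "PodResizePending":
			if c.Reason == "Infeasible" {
				return stateInfeasible
			}
			state = statePending
		case "PodResizeInProgress":
			if state == stateDone {
				state = stateInProgress
			}
		}
	}
	return state
}

// nextAction picks what the controller does next (names are illustrative).
func nextAction(gateOn, unsupported bool, state resizeState) string {
	switch {
	case !gateOn, unsupported, state == stateInfeasible:
		return "recreate" // fall back to delete-and-recreate
	case state == stateDone:
		return "relabel" // move the pod to the update revision
	default:
		return "requeue" // still pending or in progress: wait, never delete
	}
}

func main() {
	conds := []condition{{Type: "PodResizePending", Reason: "Infeasible"}}
	fmt.Println(nextAction(true, false, classify(conds)))
	fmt.Println(nextAction(true, false, classify(nil)))
}
```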

Part of a coordinated set

Foundation PR (independent of apimachinery). Used at runtime by kubedb/postgres#911. Safe to ship on any cluster: clusters without in-place resize support fall back to delete-and-recreate.

When a pod differs from the update revision only in container or pod-level
resources, resize the running pod via the pods/resize subresource instead
of deleting it, then relabel it to the update revision so revision
accounting converges without a restart. Gated by the InPlaceVerticalScaling
feature gate (default on); falls back to delete-and-recreate when the
resize is unsupported or reported infeasible.

Signed-off-by: Tamal Saha <tamal@appscode.com>
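
For illustration, the relabel step could send a patch body like the one built below. This is a sketch under assumptions: the commit does not say which patch type is used, so a JSON merge patch is assumed here, and `revisionLabelPatch` and the revision name `pg-7d9c5b` are hypothetical. Only the `controller-revision-hash` label key comes from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// revisionLabelPatch builds a JSON merge patch that sets the pod's
// controller-revision-hash label to the given update revision.
// Hypothetical helper; the patch type used by the PR is not stated.
func revisionLabelPatch(updateRevision string) ([]byte, error) {
	patch := map[string]interface{}{
		"metadata": map[string]interface{}{
			"labels": map[string]string{
				"controller-revision-hash": updateRevision,
			},
		},
	}
	return json.Marshal(patch)
}

func main() {
	body, err := revisionLabelPatch("pg-7d9c5b")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

A merge patch of this shape touches only the one label, so it does not race with other writers of the pod's metadata.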
kodiakhq Bot previously approved these changes Jun 29, 2026
Two correctness fixes to the in-place resize path:

- onlyResourcesDiffer now renders the pod's current revision and the update
  revision and compares the WHOLE pod spec (resources zeroed), instead of only
  the container slices against the live pod. This prevents a revision that
  changes a non-container field (volumes, affinity, tolerations, nodeSelector,
  securityContext, ...) from being misclassified as resources-only and resized +
  relabeled in place, which would silently drop that change.
- The MaxUnavailable update path now counts an in-place resize against the
  maxUnavailable budget (deletedPods++), so a potentially disruptive resize
  (RestartContainer resizePolicy or a readiness blip) cannot be applied to every
  stale pod in a single pass.

Threads currentRevision through updatePetSetAfterInvariantEstablished. Adds tests
covering non-container template changes.

Signed-off-by: Tamal Saha <tamal@appscode.com>
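
The budget accounting from the second fix can be sketched as below. This is a simplified model, not the controller's loop: `stalePod`, `plan`, and `planPass` are hypothetical names, and the sketch assumes every pod starts out available, whereas the real path presumably also accounts for pods that are already unavailable.

```go
package main

import "fmt"

// stalePod is a hypothetical view of a pod that is behind the update revision.
type stalePod struct {
	Name          string
	ResourcesOnly bool // true when only resources differ from the update revision
}

// plan records what a single pass would do with each stale pod.
type plan struct {
	Resized  []string
	Deleted  []string
	Deferred []string
}

// planPass walks the stale pods and charges both deletes and in-place
// resizes to the same maxUnavailable budget.
func planPass(pods []stalePod, maxUnavailable int) plan {
	var p plan
	disrupted := 0 // plays the role of deletedPods in the commit message
	for _, pod := range pods {
		if disrupted >= maxUnavailable {
			p.Deferred = append(p.Deferred, pod.Name)
			continue
		}
		if pod.ResourcesOnly {
			p.Resized = append(p.Resized, pod.Name)
		} else {
			p.Deleted = append(p.Deleted, pod.Name)
		}
		disrupted++ // a resize may restart a container, so it counts too
	}
	return p
}

func main() {
	pods := []stalePod{
		{Name: "pg-0", ResourcesOnly: true},
		{Name: "pg-1", ResourcesOnly: true},
		{Name: "pg-2", ResourcesOnly: true},
	}
	fmt.Printf("%+v\n", planPass(pods, 1))
}
```

Without the increment on the resize branch, all three pods in the example would be resized in one pass regardless of maxUnavailable.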
@tamalsaha changed the title from "Resize pods in place on resource-only template change" to "Resize pods in place on resource-only template changes" on Jun 29, 2026