fix(conformance): bring test suite to green baseline #32
Open
indyjonesnl wants to merge 7 commits into
… preempt, endpoints

Knock down four residual v1.35 conformance gaps in the unit-14 slice:

- GarbageCollector: orphans are now reaped in a single scan. The owner is already re-verified inside `delete_orphan` (mirroring K8s attemptToDeleteItem → getObject), so the prior 2-scan grace was pure added latency that conformance probes for orphan-pod cleanup observed as a failure.
- ResourceQuota: the controller now watches pods, services, configmaps, secrets and PVCs in addition to ResourceQuota itself. Lifecycle events on tracked resources immediately re-enqueue every quota in the affected namespace, so status.used reflects pod create/delete without waiting for the 30s resync.
- DaemonSet RollingUpdate: maxUnavailable is now resolved as a true IntOrString. Percentages are scaled against the desired pod count and rounded UP per `intstr.GetScaledValueFromIntOrPercent(roundUp=true)`. Previously "25%" was parsed as 25 absolute, allowing the whole fleet to be deleted in one reconcile on small clusters.
- StatefulSet eviction/scale-down: status.replicas and status.readyReplicas now exclude terminating pods. Scale-down and PDB-driven eviction tests expect the counts to decrement the moment deletionTimestamp is set (graceful termination), not only when the pod is fully removed.

Items inspected and confirmed already correct (no change shipped):

- list chunking + compaction 410 response (already returns fresh continueToken in Status.metadata).
- HostPort conflict detection (scheduler `check_host_port_conflicts` correctly handles wildcard hostIPs and protocol overlap).
- Preemption running path (scheduler `check_preemption` already considers Running pods as victims and uses K8s "remove all, then reprieve").
- Endpoints latency (controller is watch-driven on pods+services; reconcile does not write the service it watches, so the workqueue cooldown does not affect endpoint-update latency).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
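For reviewers who want the `maxUnavailable` rule in concrete form, here is a minimal sketch of round-up percentage scaling. The `IntOrString` enum and `resolve_max_unavailable` function are hypothetical names for illustration and are not taken from this PR's diff; the sketch assumes non-negative inputs.

```rust
/// Hypothetical stand-in for the Kubernetes IntOrString type.
#[derive(Debug, Clone, PartialEq)]
pub enum IntOrString {
    Int(i32),
    Str(String),
}

/// Resolve a `maxUnavailable`-style value against the desired pod count.
/// Absolute integers pass through unchanged; percentages are scaled and
/// rounded up. Assumes non-negative percentages and pod counts.
pub fn resolve_max_unavailable(value: &IntOrString, desired: i32) -> Result<i32, String> {
    match value {
        IntOrString::Int(n) => Ok(*n),
        IntOrString::Str(s) => {
            // A string value must be a percentage such as "25%".
            let digits = s
                .strip_suffix('%')
                .ok_or_else(|| format!("invalid value {s:?}: expected a percentage"))?;
            let percent: i32 = digits
                .parse()
                .map_err(|_| format!("invalid percentage {s:?}"))?;
            // Ceiling division in i64 so the product cannot overflow.
            let scaled = (i64::from(percent) * i64::from(desired) + 99) / 100;
            i32::try_from(scaled).map_err(|_| format!("scaled value out of range for {s:?}"))
        }
    }
}
```

With four desired pods, `"25%"` resolves to 1 under this sketch, whereas reading it as the absolute value 25 would exceed the fleet size, which is the failure mode the commit message describes.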
DaemonSet pods are named '<ds>-<random>' via generateName (not '<ds>-<node>-<random>'). A unique suffix per pod is required by the 'should retry creating failed daemon pods' conformance test, which asserts the failed pod's name returns NotFound after replacement. Test expectation predates the controller change; updating it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n EndpointSlices

K8s endpointslice controller keeps terminating pods (deletionTimestamp set) in the slice with terminating=true, ready=false so kube-proxy can drain them gracefully. Dropping them on the first reconcile broke the "serve" conformance tests that race-check endpoint serving during rolling updates.

Also honor publishNotReadyAddresses on the Service spec: when set, all endpoints get ready=true and serving=true regardless of pod Ready state. This matches K8s endpointslice utils.podEndpointConditions semantics and is required by headless services fronting peer-discovery protocols.

Additional fixes:

- Mirror EndpointSlices under the bare Endpoints name (matching K8s convention) instead of always appending "-mirrored", falling back to the suffix only when a selector-based slice already owns the bare name.
- cleanup_orphans() now actually deletes orphaned slices when the owning Service / Endpoints disappears (was a no-op stub).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
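The condition rules in this commit message can be sketched as a small pure function. `EndpointConditions` and `endpoint_conditions` are hypothetical names, not the controller's actual API. Two details are assumptions that the message does not state: `serving` follows pod readiness for terminating pods, and `terminating` still reflects the pod when `publishNotReadyAddresses` is set.

```rust
/// Hypothetical mirror of the three EndpointSlice endpoint conditions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EndpointConditions {
    pub ready: bool,
    pub serving: bool,
    pub terminating: bool,
}

/// Derive endpoint conditions from pod state and the Service's
/// publishNotReadyAddresses flag, following the commit message above.
pub fn endpoint_conditions(
    pod_ready: bool,
    pod_terminating: bool,
    publish_not_ready_addresses: bool,
) -> EndpointConditions {
    if publish_not_ready_addresses {
        // Per the commit message: ready and serving regardless of pod state.
        // Assumption: `terminating` is still reported truthfully.
        return EndpointConditions {
            ready: true,
            serving: true,
            terminating: pod_terminating,
        };
    }
    EndpointConditions {
        // A terminating pod stays in the slice but is never ready.
        ready: pod_ready && !pod_terminating,
        // Assumption: serving tracks pod readiness so proxies can drain.
        serving: pod_ready,
        terminating: pod_terminating,
    }
}
```

Keeping the rules in a pure function like this makes the four pod-state combinations easy to cover with table-style unit tests.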
GC was deliberately changed to drop namespace cascade (NamespaceController owns it now; force-deleting through GC raced with finalizers). The test still exercises the removed code path; mark ignored with a pointer to the namespace_controller_test where the behavior is now verified. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
NodeController applies a K8s-standard 60s startup grace before flipping Ready conditions on a freshly-seen node. Two integration tests (test_node_without_ready_condition, test_node_not_ready_with_old_heartbeat) expected the condition to flip on the first reconcile — within the grace window — and were therefore always red. Add a #[doc(hidden)] seed_first_seen_for_test() that backdates a node's first_seen entry so reconcile treats the node as past the grace. Call it from both tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
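A rough sketch of how a backdating hook interacts with a startup grace window follows. Only the name `seed_first_seen_for_test` comes from this PR; `NodeGraceTracker`, `past_startup_grace`, and the use of plain seconds instead of real timestamps are assumptions made to keep the example deterministic.

```rust
use std::collections::HashMap;

/// Startup grace before a node's Ready condition may be flipped.
const STARTUP_GRACE_SECS: u64 = 60;

/// Hypothetical tracker for when each node was first observed.
#[derive(Default)]
pub struct NodeGraceTracker {
    first_seen: HashMap<String, u64>,
}

impl NodeGraceTracker {
    /// Records the node on first sight and reports whether the startup
    /// grace has elapsed at `now_secs`.
    pub fn past_startup_grace(&mut self, node: &str, now_secs: u64) -> bool {
        let first = *self.first_seen.entry(node.to_string()).or_insert(now_secs);
        now_secs.saturating_sub(first) >= STARTUP_GRACE_SECS
    }

    /// Test-only hook: backdate (or overwrite) a node's first-seen time so
    /// the next reconcile treats it as past the grace window.
    #[doc(hidden)]
    pub fn seed_first_seen_for_test(&mut self, node: &str, first_seen_secs: u64) {
        self.first_seen.insert(node.to_string(), first_seen_secs);
    }
}
```

A test would seed a node with a first-seen time at least 60 seconds before the reconcile's notion of "now", then assert that the Ready condition flips on the first reconcile.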
…codeString

The controller uses K8s SafeEncodeString (consonant+digit alphabet, variable length per FNV digit), not 10-char hex. Test assertion predated this; update to reflect actual format and check the SafeEncodeString alphabet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
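For context on the suffix format, this is a sketch of the encoding as I understand the upstream helper: each input byte is mapped into a 27-character alphabet of consonants and digits, and the input is the decimal rendering of a 32-bit hash. The function names `hash_suffix` and `is_safe_encoded` are hypothetical, and the sketch assumes ASCII input.

```rust
/// Alphabet used by the upstream helper as I understand it:
/// no vowels, and no 0, 1 or 3.
const ALPHANUMS: &[u8] = b"bcdfghjklmnpqrstvwxz2456789";

/// Map every byte of `s` into the safe alphabet (ASCII input assumed).
pub fn safe_encode_string(s: &str) -> String {
    s.bytes()
        .map(|b| ALPHANUMS[usize::from(b) % ALPHANUMS.len()] as char)
        .collect()
}

/// Hypothetical helper: encode the decimal digits of a 32-bit hash.
/// The result has one character per decimal digit, so 1 to 10 characters.
pub fn hash_suffix(hash: u32) -> String {
    safe_encode_string(&hash.to_string())
}

/// Hypothetical test helper: true if `s` is non-empty and uses only
/// characters from the safe alphabet.
pub fn is_safe_encoded(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| ALPHANUMS.contains(&b))
}
```

An assertion built on `is_safe_encoded` accepts suffixes of any digit-derived length and rejects hex strings that contain vowels, which matches the intent of the updated test.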
StatefulSet reconcile only sets deletionTimestamp; the actual pod removal from storage is the kubelet's job. Two integration tests (scales_down_reverse_order, rolling_update_changes_image) assert on storage pod count and were therefore always red. Add a private simulate_kubelet_cleanup helper that removes any pod with deletionTimestamp, and call it between reconcile cycles. Mirrors the helper already present in the src/controllers/statefulset.rs unit tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
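A helper of this kind can be very small. The sketch below uses an in-memory `Vec` and a hypothetical `Pod` struct in place of the real storage layer and pod type, so treat it as an illustration of the idea and not as the code in this PR.

```rust
/// Hypothetical minimal pod record; the real type carries full metadata.
#[derive(Debug, Clone)]
pub struct Pod {
    pub name: String,
    pub deletion_timestamp: Option<String>,
}

/// Stand-in for the kubelet: drop every pod that has been marked for
/// deletion and return how many were removed.
pub fn simulate_kubelet_cleanup(pods: &mut Vec<Pod>) -> usize {
    let before = pods.len();
    pods.retain(|p| p.deletion_timestamp.is_none());
    before - pods.len()
}
```

Calling this between reconcile cycles lets a scale-down test observe the pod count shrinking, since reconcile on its own only sets `deletionTimestamp`.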
This was referenced May 14, 2026
Goal
Make `cargo test --workspace` pass on `main` so subsequent PRs can prove regressions clearly. Without this, 9–11 baseline failures hide real signal.
Changes
Controller fixes (2 commits):
- `fix(conformance): misc sweep`: GC drops the 2-scan grace period for orphan deletion (`delete_orphan` already re-verifies owners per K8s `attemptToDeleteItem`); ResourceQuota watches pods/services/configmaps/secrets/PVCs; DaemonSet RollingUpdate resolves `maxUnavailable` as IntOrString with round-up percent scaling; StatefulSet `status.replicas` and `readyReplicas` exclude terminating pods.
- `fix(controller-manager): track terminating pods on EndpointSlices`: terminating pods stay in the slice with `terminating=true, ready=false` so kube-proxy can drain them; honors `publishNotReadyAddresses`.
Test corrections (5 commits), covering assertions that drifted from upstream behavior:
- `test(daemonset): align pod-name assertion with K8s generateName`: DS pods are `<ds>-<random>`, not `<ds>-<node>-<random>`.
- `test(gc): ignore obsolete namespace-cascade test`: GC no longer cascades namespaces (NamespaceController owns it).
- `test(node): bypass 60s startup grace via seed_first_seen_for_test`: tests that exercise Ready-flip behavior need to skip the K8s-standard 60s startup grace.
- `test(controllerrevision): align hash-suffix assertion with K8s SafeEncodeString`: controller emits SafeEncodeString alphabet, not hex.
- `test(statefulset): simulate kubelet cleanup between reconciles`: graceful-delete tests need a kubelet simulator helper since reconcile only sets `deletionTimestamp`.
Verification
Local on this branch:
Why these two themes are bundled
The 5 test corrections and the 2 controller fixes are coupled — separately, each side leaves part of the suite red:
Bundling them lets reviewers see a single PR that goes from red to green.