[release-4.19] Avoid using lb-ext.kubeconfig for seedgen cleanup#6009
Conversation
# Background During cleanup before seed generation, we want to delete all pods in the Succeeded or Failed phase. # Issue The current implementation uses `oc delete pod --all-namespaces --field-selector=status.phase==...` with the node `lb-ext.kubeconfig` as a kubeconfig. This is in contrast to the rest of the controller, which uses the in-cluster config via a client loaded with ctrl.GetConfigOrDie(). We've had users report TLS errors when the controller tries to use the lb-ext.kubeconfig, which seems expected since that kubeconfig is meant for external use and may not have the correct CA or credentials for in-cluster API access. Before they hit that error, the client has already been used, so it proves that the in-cluster config works, and this is specifically an issue with using the lb-ext.kubeconfig for this cleanup step. # Solution The solution is to avoid using the lb-ext.kubeconfig for this cleanup step, and instead use the same in-cluster client that the rest of the controller uses. # Caveat As the existing TODO states: ```go // TODO: Can this be done cleanly via client? The client.DeleteAllOf seems to require a specified namespace, so maybe loop over the namespaces ``` The Kubernetes API does not support DeleteCollection without a namespace for namespaced resources, so we cannot simply use DeleteAllOf with no namespace. Instead, we copy the behavior of kubectl when using `delete --all-namespaces`, which is to list the resources with the field selector, and then delete them one by one. This is less efficient than a single DeleteCollection call, but it's already what kubectl was doing when we were calling it, so we know it works and is acceptable for this use case. # Implementation Details We cannot use `client.MatchingFields` because we do not have an index set up and it seems overkill to do so. # Agent Interaction Agent assisted. The conversation went something like this: - Where does r.Client get its kubeconfig? → ctrl.GetConfigOrDie() → in-cluster SA - Where does the oc call get its kubeconfig? → lb-ext.kubeconfig on the host via chroot executor - Can we generate a kubeconfig from rest.Config? → no built-in utility, hand-rolled versions kept being incomplete - Just use DeleteAllOf with no namespace? → tried it, in a toy project using `sigs.k8s.io/controller-runtime/pkg/envtest`, API rejects it (no cluster-scoped DeleteCollection for namespaced resources) - Inspect kubectl source - what does it do when delete --all-namespaces? → list + delete one by one, sequentially - Did that, modified toy project to prove it works, updated the controller, added comment explaining the reasoning
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/test ibi-e2e-flow ibu-e2e-flow |
6 similar comments
|
/test ibi-e2e-flow ibu-e2e-flow |
|
/test ibi-e2e-flow ibu-e2e-flow |
|
/test ibi-e2e-flow ibu-e2e-flow |
|
/test ibi-e2e-flow ibu-e2e-flow |
|
/test ibi-e2e-flow ibu-e2e-flow |
|
/test ibi-e2e-flow ibu-e2e-flow |
|
/test ibi-e2e-flow |
|
@openshift-cherrypick-robot: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Obsoleted by #6881 |
This is an automated cherry-pick of #6008
/assign jc-rh
/cherrypick release-4.18 release-4.17 release-4.16