Add Machine API Integration for bootstrap Path#133
Conversation
37a8ff6 to
ff90954
Compare
ceb0cf3 to
40eb2ac
Compare
There was a problem hiding this comment.
Pull request overview
Adds a new aksmachine component (action + server implementation) intended to register the local host as an AKS “Machine” during bootstrap, and upgrades the Azure ContainerService ARM SDK dependency to v8 to access the required APIs. The machine-registration bootstrap step is present but intentionally not yet wired into the bootstrap sequence (commented out) until the RP side is live.
Changes:
- Upgrade
armcontainerserviceSDK usage across the repo fromv5tov8(plus correspondinggo.mod/go.sumupdates). - Introduce the new
components/aksmachineaction proto, generated code, redaction, andv20260301server implementation + unit tests. - Add an
enabledflag to the Arc install action and skip Arc installation when disabled; tighten cluster-config enricher completion criteria to require both ServerURL and CA data.
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/spec/collector.go | Switch ContainerService SDK import to armcontainerservice/v8. |
| pkg/spec/collector_test.go | Switch ContainerService SDK import to armcontainerservice/v8. |
| pkg/kube/client.go | Switch ContainerService SDK import to armcontainerservice/v8. |
| pkg/components/arc/arc_base.go | Switch ContainerService SDK import to armcontainerservice/v8. |
| pkg/bootstrapper/components.go | Add ensureMachine action resolver + AzureCredential builder; pass enabled into Arc action. |
| pkg/bootstrapper/cluster_config_enricher.go | Switch SDK import to v8; consider step complete only when both ServerURL + CA are present. |
| pkg/bootstrapper/bootstrapper.go | Add (commented) TODO/bootstrap hook for ensure-machine step. |
| go.mod | Bump ContainerService SDK dependency to armcontainerservice/v8. |
| go.sum | Update sums for ContainerService v8 and transitive ARM deps. |
| components/exports.go | Register new aksmachine/v20260301 component via blank import. |
| components/arc/action.proto | Add enabled field to InstallArcSpec. |
| components/arc/action.pb.go | Regenerated code for the new enabled field. |
| components/arc/v20260301/install_arc.go | Skip Arc installation when spec enabled=false. |
| components/arc/v20260301/arc_registration.go | Switch ContainerService SDK import to armcontainerservice/v8. |
| components/arc/v20260301/arc_helpers.go | Switch ContainerService SDK import to armcontainerservice/v8. |
| components/aksmachine/action.proto | New proto defining EnsureMachine action and credential types. |
| components/aksmachine/action.pb.go | Generated Go code for the new aksmachine proto. |
| components/aksmachine/redact.go | Redact SP client secret from EnsureMachine action. |
| components/aksmachine/v20260301/exports.go | Register EnsureMachine action server. |
| components/aksmachine/v20260301/ensure_machine.go | Implement agent-pool + machine ensure logic using ARM clients. |
| components/aksmachine/v20260301/ensure_machine_test.go | Unit tests for helper functions in ensure-machine implementation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 21 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func buildARMClientOptions(endpointOverride string) *arm.ClientOptions { | ||
| if endpointOverride == "" { | ||
| return nil | ||
| } | ||
| return &arm.ClientOptions{ | ||
| ClientOptions: azcore.ClientOptions{ | ||
| Cloud: cloud.Configuration{ | ||
| Services: map[cloud.ServiceName]cloud.ServiceConfiguration{ | ||
| cloud.ResourceManager: { | ||
| Endpoint: endpointOverride, | ||
| // No audience needed for local servers that don't validate tokens. | ||
| Audience: endpointOverride, | ||
| }, | ||
| }, | ||
| }, | ||
| InsecureAllowCredentialWithHTTP: true, | ||
| }, | ||
| } |
There was a problem hiding this comment.
buildARMClientOptions enables InsecureAllowCredentialWithHTTP for any non-empty endpointOverride. If this override is ever made configurable (env/flag), it could accidentally allow sending ARM credentials over plaintext HTTP to a non-local endpoint. Consider restricting the override to loopback (localhost/127.0.0.1) and/or guarding this behind a test-only build tag to prevent insecure production use.
There was a problem hiding this comment.
This is for development testing only, will never be made configurable.
| // Redact the service principal client secret carried in the spec. | ||
| if sp := x.GetSpec().GetAzureCredential().GetServicePrincipal(); sp != nil { |
There was a problem hiding this comment.
EnsureMachine.Redact can panic when Spec or AzureCredential is nil (e.g., when using Azure CLI credential fallback). Hub.ApplyAction always calls Redact() on the response action, so this needs to be nil-safe before dereferencing nested fields; only redact the SP secret when all intermediate pointers are non-nil and the service principal oneof is set.
| // Redact the service principal client secret carried in the spec. | |
| if sp := x.GetSpec().GetAzureCredential().GetServicePrincipal(); sp != nil { | |
| // If the receiver or any nested fields are nil, there's nothing to redact. | |
| if x == nil { | |
| return | |
| } | |
| spec := x.GetSpec() | |
| if spec == nil { | |
| return | |
| } | |
| cred := spec.GetAzureCredential() | |
| if cred == nil { | |
| return | |
| } | |
| // Redact the service principal client secret carried in the spec, if present. | |
| if sp := cred.GetServicePrincipal(); sp != nil { |
There was a problem hiding this comment.
Nope it won't panic. GetXXX() has nil pointer protection internally
| spec := action.GetSpec() | ||
|
|
||
| // Skip all Azure operations when drift detection/remediation is disabled. | ||
| if !spec.GetEnabled() { | ||
| a.logger.Info("EnsureMachine: drift detection and remediation is disabled, skipping") | ||
| item, err := anypb.New(action) |
There was a problem hiding this comment.
ApplyAction assumes action.Spec is non-nil (spec := action.GetSpec(); then spec.GetEnabled()/GetSubscriptionId/...); a malformed request without spec will cause a nil-pointer panic and crash the action server. Consider validating spec != nil and returning codes.InvalidArgument when missing.
Introduces a new aksmachine component that registers the local machine with the AKS control plane during bootstrap using the ARM Machine and AgentPool APIs.
ensureMachine will:
aksflexnodespool with mode "Machines" — if it already exists, return early (idempotent)."aks-flex-node"tag and Kubernetes profile (version, max pods, labels, taints, kubelet config) into theaksflexnodespool. If machine already exists, return early. Any update of the machine property should be done through AKS RP API.This action will be added into bootstrap steps once RP side of the change is released, right now it is commented out.
Testing:
