add qa-infra-automation as a provider via interoperability#552
add qa-infra-automation as a provider via interoperability#552slickwarren wants to merge 24 commits intorancher:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces a new interoperability provider (interoperability/qainfraautomation) that lets rancher/tests orchestrate infrastructure + downstream cluster setup through a local clone of rancher-qa-infra-automation, and adds a NeuVector “hardened custom cluster” validation suite as a concrete usage example.
Changes:
- Added a new
interoperability/qainfraautomationpackage (config schema, OpenTofu runner, Ansible runner, provisioning orchestration). - Added a NeuVector hardened test suite + accompanying docs/config example under
validation/neuvector. - Updated UI plugin install flow to tolerate watch errors when the app is already deployed; added NeuVector UI extension constant and new chart-install helpers for NeuVector hardened options.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| validation/neuvector/neuvector_hardened_test.go | Adds an example suite that provisions a hardened custom downstream cluster via qainfraautomation and installs NeuVector + UI extension. |
| validation/neuvector/aws_custom_cluster_config.yaml | Adds an example cattle config for AWS-backed custom cluster provisioning via qainfraautomation. |
| validation/neuvector/README.md | Documents prerequisites/config for the NeuVector validation tests using qainfraautomation. |
| interoperability/qainfraautomation/provisioning.go | Implements orchestration for provisioning Rancher/custom/standalone clusters via OpenTofu + Ansible. |
| interoperability/qainfraautomation/tofu/tofu.go | Adds a thin OpenTofu CLI wrapper used by the provisioning layer. |
| interoperability/qainfraautomation/ansible/ansible.go | Adds an Ansible CLI wrapper (inventory rendering, ssh-add, playbook execution). |
| interoperability/qainfraautomation/config/config.go | Defines the qainfraautomation config model loaded from cattle config. |
| interoperability/qainfraautomation/README.md | Documents qainfraautomation configuration and supported flows. |
| interoperability/charts/constants.go | Adds a NeuVector UI extension name constant for interoperability tests. |
| actions/uiplugins/uiplugins.go | Adds fallback verification when the watch for UI plugin install ends with an error. |
| actions/charts/neuvector_new.go | Adds NeuVector install helpers with hardened/K3s value overrides and chart-state polling. |
| actions/charts/charts.go | Extends chart payload options with K3s and Hardened flags (used by NeuVector hardened flow). |
| interoperability/go.mod | Updates the interoperability module dependency set. |
| interoperability/go.sum | Updates the interoperability module checksum set. |
| func (c *Client) logOutputs() { | ||
| out, err := c.run("tofu", []string{"output", "-json"}, c.moduleDir, nil) | ||
| if err != nil { | ||
| logrus.Warnf("[tofu] could not retrieve outputs after apply in %s: %v", c.moduleDir, err) | ||
| return | ||
| } | ||
| var outputs map[string]struct { | ||
| Value any `json:"value"` | ||
| } | ||
| if err := json.Unmarshal(out, &outputs); err != nil { | ||
| logrus.Warnf("[tofu] could not parse outputs JSON after apply in %s: %v", c.moduleDir, err) | ||
| return | ||
| } | ||
| if len(outputs) == 0 { | ||
| logrus.Infof("[tofu] apply outputs: (none) in %s (workspace=%s)", c.moduleDir, c.workspace) | ||
| return | ||
| } | ||
| for k, v := range outputs { | ||
| logrus.Infof("[tofu] apply output %q = %v (workspace=%s)", k, v.Value, c.workspace) | ||
| } | ||
| } |
There was a problem hiding this comment.
logOutputs() logs every OpenTofu output value at INFO level. Tofu outputs frequently contain sensitive material (kubeconfigs, tokens, keys), so this can leak secrets into CI logs. Prefer logging only output names, or parse the sensitive flag from tofu output -json and redact/skip sensitive values (and/or lower to DEBUG with redaction).
There was a problem hiding this comment.
I don't think this is important to add on this PR, but if it becomes an issue we can address it later.
There was a problem hiding this comment.
Agreed. We kind of need an entire logging overhaul AND security overhaul, but neither are too important to dedicate our limited time and resources to at the moment.
a06a31c to
86ebff3
Compare
hamistao
left a comment
There was a problem hiding this comment.
Addressing the overall design, I think if we can avoid the need for cloning rancher/qa-infra-automation, we should. I really like your suggestion of adding it to go.mod, and perhaps instead of doing releases we can have automated pull requests to ensure we are constantly using HEAD on the main branch, similarly to what we do with shepherd.
Also, I do think it would be simpler to review having the neuvector example in a separate PR as you said, but I am not requiring you to move that, feel free to leave it here you wish.
Lastly, I know this is not ready for a full review, but I left some comments that hopefully will de helpful.
If you want to talk more about the design we can schedule a meeting some time today to go over it more.
8008e8a to
3946043
Compare
hamistao
left a comment
There was a problem hiding this comment.
I received a request review so dropped in some more comments. But during the review I saw some comments from the previous one are unresolved so I guess I came too early.
| // RunPlaybook executes ansible-playbook with the given playbook and inventory paths, streaming output via logrus. | ||
| func (c *Client) RunPlaybook(playbookPath, inventoryPath string, extraEnv []string) error { | ||
| absPlaybook := filepath.Join(c.repoPath, playbookPath) | ||
| absInventory := filepath.Join(c.repoPath, inventoryPath) |
There was a problem hiding this comment.
What do you think of embedding the inventory in the client object?
floatingman
left a comment
There was a problem hiding this comment.
Just a few comments. There seems to be inconsistent use of context.TODO() and context.Background() throughout the code. I'd prefer context.Background() if the context is not just a placeholder.
| const ( | ||
| uiPluginChartsRepoName = "rancher-ui-plugins" | ||
| uiPluginChartsURL = "https://github.com/rancher/ui-plugin-charts" | ||
| uiPluginChartsBranch = "main" |
There was a problem hiding this comment.
Why is this hardcoded to main? Is it possible to make this a variable so other versions can be tested?
There was a problem hiding this comment.
yes, but I think that should be in a separate PR. I think in that case we would want a dedicated test for the UI extension, since its not really being tested here but only used in setup.
bb95097 to
b154995
Compare
de164cd to
fd9e5d1
Compare
fd9e5d1 to
3e0672a
Compare
This PR connects rancher/tests -> rancher/qa-infra-automation by adding it as a client to rancher/tests/interoperability.
Feedback on design is welcome. Overall, it looks like this:
in principle, this should only be used as a 'setup' option, when a test needs access to a downstream cluster to do some tests.
leaving the neuvector test in for now to provide an example, but for complexity's sake I can put this in a separate PR upon request.