Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 19 additions & 2 deletions docs/user/cli-reference.md
Original file line number Diff line number Diff line change
Expand Up @@ -1589,13 +1589,24 @@ helm package ./bundle -d /tmp/
helm push /tmp/aicr-bundle-*.tgz oci://<your-registry>/<path>

# 3. Install — the URL is supplied here, not at bundle time
# `--set repoURL` is the PARENT NAMESPACE (no trailing chart name).
# The parent Application appends `.Chart.Name` via its `source.chart`
# field, and path-based children append it directly into their
# rendered `source.repoURL`. Including the chart name in --set
# repoURL double-appends it and the children fail to resolve.
helm install aicr-bundle oci://<your-registry>/<path>/aicr-bundle --version <chart-version> \
-n argocd \
--set repoURL=oci://<your-registry>/<path>/aicr-bundle \
--set repoURL=oci://<your-registry>/<path> \
--set targetRevision=<chart-version>
```

The chart's `templates/aicr-stack.yaml` renders the parent Argo Application with `.Values.repoURL` and `.Values.targetRevision` substituted in. The parent Application then triggers Argo to render the chart again from the OCI source, creating the per-component child Applications with sync-wave ordering preserved. Child Applications whose source is path-based (manifest-only and mixed-component `-post` folders) inherit `.Values.repoURL` so they too pull from the same published location.
The chart's `templates/aicr-stack.yaml` renders the parent Argo Application with `.Values.repoURL` and `.Values.targetRevision` substituted in. The parent Application then triggers Argo to render the chart again from the OCI source, creating the per-component child Applications with sync-wave ordering preserved. Child Applications whose source is path-based (manifest-only and mixed-component `-pre` / `-post` folders) inherit `.Values.repoURL` and append `.Chart.Name` so they pull from the same published artifact as the parent.

**Argo CD OCI prerequisites.** Path-based child Applications use Argo CD's generic OCI artifact source type (introduced in Argo CD v2.13). The argocd-helm bundle therefore requires:
- Argo CD **≥ v2.13** on the target cluster.
- A registry that serves Helm-pushed OCI artifacts through the generic OCI manifest fetch path (most modern registries — ECR, GHCR, GAR, Harbor, Artifactory, plain `oras`-compatible registries — support this).

If the recipe is pure-Helm (no manifest-only / mixed components), path-based children are not exercised and the bundle can work on Argo CD versions older than v2.13. If path-based children are present, Argo CD v2.13+ is required. See the troubleshooting section below if `Failed to load target state` appears on `aicr-stack` or any `<component>-pre` / `<component>-post` Application.

**`helm install ./bundle` from a local directory** *also* works, but with a caveat: child Applications whose source is path-based require Argo's repo-server to fetch the bundle from a remote (git or OCI) — there is no local-filesystem source type for an Argo Application. Local `helm install` is therefore end-to-end only when the recipe contains pure-Helm components. For everything else, publish first.

Expand Down Expand Up @@ -2428,6 +2439,12 @@ aicr --debug bundle -r recipe.yaml

**"failed to load manifest \<path\> for component \<name\>"** — the recipe references a manifest path that does not exist in the current AICR binary's embedded data. This usually means the recipe was generated by an older binary and a referenced manifest has since been removed or relocated. Regenerate the recipe with the current binary (`aicr recipe ...`) and re-bundle. AICR recipes are a point-in-time artifact of the binary that produced them; bundling a stale recipe against a newer binary is not supported.

**`--deployer argocd-helm`: `aicr-stack` or `<component>-pre` / `<component>-post` Application stuck at `Unknown` sync status / "Failed to load target state: ... `<registry>/<path>:<tag>: not found`"** — Argo CD cannot resolve the OCI artifact the parent or path-based child Application points at. Three common causes, in order of likelihood:

1. **Chart name doubled in `--set repoURL`.** Under the current contract, `--set repoURL` carries the **parent namespace only** (e.g., `oci://ghcr.io/myorg`). The parent Application appends `.Chart.Name` via its `source.chart` field, and path-based children append it directly into their rendered `source.repoURL`. Passing `--set repoURL=oci://ghcr.io/myorg/aicr-bundle` produces a double-suffixed reference (`.../aicr-bundle/aicr-bundle:<tag>`) that does not exist. Drop the trailing chart segment.
2. **Argo CD older than v2.13.** Path-based children rely on Argo CD's generic OCI artifact source type, added in v2.13. Older Argo treats the source as Git and fails to resolve. Check with `kubectl -n argocd get deploy argocd-repo-server -o jsonpath='{.spec.template.spec.containers[0].image}'`. Upgrade Argo, or use `--deployer helm` if Argo upgrade is not an option.
3. **Tag missing from the registry.** Verify the published artifact exists at the exact tag the parent expects: `oras manifest fetch <registry>/<path>/<chart>:<tag>`. If `aicr bundle` is invoked without a tag (`oci://<registry>/<path>/<chart>` with no `:<tag>` suffix), the CLI version is used as the default — make sure `--set targetRevision=<chart-version>` at install time matches.

## External Data Directory

The `--data` flag enables extending or overriding the embedded recipe data with external files. This allows customization without rebuilding the CLI.
Expand Down
8 changes: 6 additions & 2 deletions pkg/bundler/deployer/argocdhelm/argocdhelm.go
Original file line number Diff line number Diff line change
Expand Up @@ -481,10 +481,14 @@ spec:
// helm package ./bundle -d /tmp/
// helm push /tmp/aicr-bundle-*.tgz oci://ghcr.io/myorg
//
// # install — same URL passed once, used for parent + children
// # install — --set repoURL is the PARENT NAMESPACE (no chart name).
// # The parent App appends .Chart.Name via source.chart, and path-based
// # children append it directly into their rendered source.repoURL —
// # including the chart name in --set repoURL double-suffixes both and
// # the children fail to resolve. See issues #1018 / #1034.
// helm install aicr-bundle oci://ghcr.io/myorg/aicr-bundle --version <tag> \
// -n argocd \
// --set repoURL=oci://ghcr.io/myorg/aicr-bundle \
// --set repoURL=oci://ghcr.io/myorg \
// --set targetRevision=<tag>
//
// The user could also `kubectl apply` the rendered parent App directly
Expand Down
137 changes: 137 additions & 0 deletions pkg/bundler/deployer/argocdhelm/argocdhelm_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1369,6 +1369,143 @@ func TestGenerate_AppNameValidatedAtBoundary(t *testing.T) {
}
}

// TestHelmTemplate_MixedComponentPreChildResolvesFromOCI is the live-render
// regression test for issue #1018: a recipe with a Helm component AND
// ComponentPreManifests for that same component (the gke-cos / EKS GB200
// shape from the bug report) produces a synthetic NNN-<name>-pre folder
// rendered into a path-based child Application. Before the post-#1035
// contract, the path-based child's `source.repoURL` did not append
// `.Chart.Name`, so when the user `--set repoURL=oci://<registry>/<org>`
// (the parent namespace) Argo CD's generic OCI source resolved
// `<registry>/<org>:<tag>` — an artifact that does not exist — and the
// child stuck at `Unknown / Failed to load target state`.
//
// This test exercises the full chain: generator → helm template render
// → rendered child Application asserts. It pins both the parent and the
// `-pre` child to the same expected `<parent-namespace>/<chart-name>:
// <tag>` OCI artifact reference. The sibling upstream-helm child
// (`gpu-operator`) is also asserted to confirm its `repoURL` stays as
// the upstream Helm chart registry (not templated from .Values.repoURL).
func TestHelmTemplate_MixedComponentPreChildResolvesFromOCI(t *testing.T) {
if _, err := exec.LookPath("helm"); err != nil {
t.Skip("helm not available; skipping live-render test")
}

outputDir := t.TempDir()
rr := newRecipeResult("v1.0.0", []recipe.ComponentRef{
{
Name: "gpu-operator", Namespace: "gpu-operator", Chart: "gpu-operator",
Version: "v25.3.3", Type: recipe.ComponentTypeHelm,
Source: "https://helm.ngc.nvidia.com/nvidia",
},
})
rr.DeploymentOrder = []string{"gpu-operator"}

g := &Generator{
RecipeResult: rr,
ComponentValues: map[string]map[string]any{
"gpu-operator": {"driver": map[string]any{"version": "580"}},
},
Version: "v0.0.0-test",
// Pre-manifest in the gke-cos / EKS GB200 shape from #1018 — a
// Namespace primer that must apply before the gpu-operator chart.
ComponentPreManifests: map[string]map[string][]byte{
"gpu-operator": {
"components/gpu-operator/manifests/kernel-module-params.yaml": []byte(
"apiVersion: v1\nkind: ConfigMap\nmetadata:\n name: nvidia-kernel-module-params\n namespace: gpu-operator\n",
),
},
},
}
if _, err := g.Generate(context.Background(), outputDir); err != nil {
t.Fatalf("Generate() error = %v", err)
}

const setRepoURL = "oci://example.test/myorg"
const wantPathChildRepoURL = setRepoURL + "/" + DefaultChartName // post-#1035 fix
const wantTagName = "v9.9.9-issue-1018"

cmdCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
cmd := exec.CommandContext(cmdCtx, "helm", "template", "test-release", outputDir, //nolint:gosec // controlled args
"--set", "repoURL="+setRepoURL,
"--set", "targetRevision="+wantTagName,
)
out, err := cmd.CombinedOutput()
if err != nil {
t.Fatalf("helm template failed: %v\noutput:\n%s", err, out)
}

dec := yaml.NewDecoder(strings.NewReader(string(out)))
type appLite struct {
Kind string `yaml:"kind"`
Metadata struct{ Name string } `yaml:"metadata"`
Spec struct {
Source struct {
RepoURL string `yaml:"repoURL"`
Chart string `yaml:"chart"`
TargetRevision string `yaml:"targetRevision"`
Path string `yaml:"path"`
} `yaml:"source"`
} `yaml:"spec"`
}
found := map[string]appLite{}
for {
var a appLite
decErr := dec.Decode(&a)
if errors.Is(decErr, io.EOF) {
break
}
if decErr != nil {
t.Fatalf("failed to decode rendered YAML: %v\noutput:\n%s", decErr, out)
}
if a.Kind == "Application" && a.Metadata.Name != "" {
found[a.Metadata.Name] = a
}
}

// Parent: uses Helm chart shape (source.chart = .Chart.Name).
parent, ok := found["aicr-stack"]
if !ok {
t.Fatalf("rendered output missing parent Application 'aicr-stack'\noutput:\n%s", out)
}
if parent.Spec.Source.RepoURL != setRepoURL {
t.Errorf("parent App repoURL: got %q, want %q (parent namespace, no chart name)", parent.Spec.Source.RepoURL, setRepoURL)
}
if parent.Spec.Source.Chart != DefaultChartName {
t.Errorf("parent App chart: got %q, want %q", parent.Spec.Source.Chart, DefaultChartName)
}

// The path-based -pre child is the #1018 regression target. Argo CD's
// generic OCI source treats `repoURL` as the full artifact, so the
// rendered value MUST equal `<parent-namespace>/<chart-name>` for the
// `<reg>/<org>/<chart>:<tag>` lookup to resolve.
preChild, ok := found["gpu-operator-pre"]
if !ok {
t.Fatalf("rendered output missing path-based child 'gpu-operator-pre'\noutput:\n%s", out)
}
if preChild.Spec.Source.RepoURL != wantPathChildRepoURL {
t.Errorf("gpu-operator-pre child repoURL: got %q, want %q (issue #1018: must be parent-namespace + chart-name so the generic OCI source resolves to the published artifact)", preChild.Spec.Source.RepoURL, wantPathChildRepoURL)
}
if preChild.Spec.Source.Path != "001-gpu-operator-pre" {
t.Errorf("gpu-operator-pre path: got %q, want %q (NNN-folder name is structural; must not be templated)", preChild.Spec.Source.Path, "001-gpu-operator-pre")
}
if preChild.Spec.Source.TargetRevision != wantTagName {
t.Errorf("gpu-operator-pre child targetRevision: got %q, want %q", preChild.Spec.Source.TargetRevision, wantTagName)
}

// The sibling upstream-helm child must keep the upstream chart
// registry — its source.repoURL is NOT templated from .Values.repoURL.
// Guards against accidentally widening the path-based template change.
upstream, ok := found["gpu-operator"]
if !ok {
t.Fatalf("rendered output missing upstream-helm child 'gpu-operator'\noutput:\n%s", out)
}
if upstream.Spec.Source.RepoURL != "https://helm.ngc.nvidia.com/nvidia" {
t.Errorf("gpu-operator upstream-helm child repoURL: got %q, want upstream Helm registry (must not be templated from .Values.repoURL)", upstream.Spec.Source.RepoURL)
}
}

// TestHelmTemplate_FailsWithoutRepoURL verifies the `required` directive
// in the parent App template fires when the user omits --set repoURL. This
// is the safety net that prevents users from accidentally publishing a
Expand Down
Loading