From 5922a4a3b0957c843fb20424ad1e44ce0cf04b00 Mon Sep 17 00:00:00 2001
From: JerimiahCP
Date: Tue, 28 Apr 2026 10:37:41 -0500
Subject: [PATCH 1/2] Live-test plugin against cpln-customer-demos and fix three confirmed bugs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Tested by deploying a real workload (nginx, serverless) to org cpln-customer-demos / GVC ai-plugin-tesr (aws-us-east-1, aws-us-west-2), then exercising secrets, identities, policies, autoscaling config, logs, exec, and the cpln/protected tag. All tests run against the actual Control Plane API.

Three bugs found and corrected; two pre-existing changes reverted after live tests proved them wrong.

== BUG 1: Kubernetes-style resources block not valid in workload manifests ==

Attempting to apply a workload manifest with spec.containers[N].resources.requests/limits returned a 400: '"resources" is not allowed'. Control Plane does not use the Kubernetes nested resources structure. CPU and memory are flat fields directly on the container:

  cpu: 50m
  memory: 128Mi

Fix: added to the "Commands / fields that don't exist" table in cli-conventions.md and to the Common Validation Errors table in workload-manifest-reference.md.

== BUG 2: cpln secret create-opaque --payload flag does not exist ==

cli-conventions.md listed --payload as a valid flag for create-opaque. Running `cpln secret create-opaque --payload ...` returns exit 127 with the help text showing only --name and --file. The --payload name comes from the API JSON body field, which the CLI does not expose as a flag. The correct invocation is:

  cpln secret create-opaque --name NAME --file /path/to/value.txt --encoding plain

Confirmed via `cpln secret create-opaque --help` and live create.

Fix: corrected the Required Flags column in cli-conventions.md.

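For scripted callers, the file-based invocation from BUG 2 can be sketched as below. This is a minimal illustration, not part of the plugin: the helper name `build_create_opaque_argv` is hypothetical, and it only assembles the argument list using the flags confirmed above (--name, --file, --encoding plain) without executing anything.

```python
import os
import tempfile


def build_create_opaque_argv(name: str, value: str) -> tuple[list[str], str]:
    """Write `value` to a private temp file and return (argv, path).

    Hypothetical helper: it uses only the flags named in this commit
    message and does not run the CLI itself.
    """
    # mkstemp creates the file readable and writable by the owner only.
    fd, path = tempfile.mkstemp(prefix="cpln-secret-", suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(value)
    argv = [
        "cpln", "secret", "create-opaque",
        "--name", name,
        "--file", path,          # there is no --payload flag
        "--encoding", "plain",   # plaintext value, not base64
    ]
    return argv, path


if __name__ == "__main__":
    argv, path = build_create_opaque_argv("my-api-key", "s3cr3t")
    try:
        # Pass argv to subprocess.run(argv, check=True) to actually create it.
        print(" ".join(argv))
    finally:
        os.remove(path)  # do not leave the plaintext value on disk
```

Building the argument list separately from execution keeps the secret value out of the command line and makes the flag set easy to check against `cpln secret create-opaque --help`.
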
== BUG 3: Policy permission alphabetical sort is NOT enforced ==

access-control/SKILL.md and workload-troubleshooter/diagnostics.md both stated that permissions within a policy binding must be sorted alphabetically, and that submitting [view, reveal] would fail cpln apply with a validation error.

Live test disproved this: applying a policy with unsorted permissions [view, reveal] succeeded (HTTP 201) and the API silently stored them sorted as [reveal, view]. The sort requirement does not exist at the API level; the platform handles ordering on write.

Fix: corrected access-control/SKILL.md gotchas section and the policy troubleshooting note in workload-troubleshooter/diagnostics.md to reflect actual behavior.

== REVERT: exitCode-based failure detection script (cpln-guardrails.md, stateful-storage/SKILL.md) ==

A pre-existing change replaced the original message-text polling loop with a structured exitCode check using cpln workload get. Live testing revealed two problems:

  1. cpln workload get does not expose .status.versions; that path is null on the workload object. The data lives in cpln workload get-deployments under .items[].status.versions.
  2. The containers field within versions is an object keyed by container name, not an array, so array-index access fails.

The exitCode change was reverted to the original message-text grep loop in both files pending a properly verified replacement.

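The JSON shape described in the REVERT section can be walked as in the sketch below. It is an illustration under stated assumptions, not the verified replacement: it assumes `versions` is a list of version objects, and the per-container keys in the sample ("exitCode", "message") are hypothetical placeholders. The only facts taken from the live test are the `.items[].status.versions` path and that `containers` is an object keyed by container name.

```python
from typing import Any, Iterator


def iter_container_statuses(
    deployments: dict[str, Any],
) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (container_name, status) from get-deployments style JSON."""
    for item in deployments.get("items") or []:
        # Assumption: `versions` is a list; a missing or null path yields nothing.
        versions = (item.get("status") or {}).get("versions") or []
        for version in versions:
            # `containers` is an object keyed by container name, so iterate
            # key/value pairs instead of indexing it like an array.
            for name, status in (version.get("containers") or {}).items():
                yield name, status


# Hypothetical sample shaped like the description above; the keys inside
# each container entry are illustrative only.
sample = {
    "items": [
        {
            "status": {
                "versions": [
                    {"containers": {"nginx": {"exitCode": 0, "message": "running"}}}
                ]
            }
        }
    ]
}

if __name__ == "__main__":
    for name, status in iter_container_statuses(sample):
        print(name, status.get("exitCode"))
```

Any replacement script would still need its field names checked against real `cpln workload get-deployments` output before it goes back into the docs.
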
== Full test manifest ==

HOOKS (all unit tested via simulated tool inputs):
  [PASS] Block cpln secret create (generic)
  [PASS] Allow cpln secret create-opaque
  [PASS] Block cpln apply without --file (including BSD grep -- fix)
  [PASS] Allow cpln apply --file
  [PASS] Block cpln gvc delete-all-workloads
  [PASS] Block cpln volumeset shrink
  [PASS] Block cpln list
  [PASS] Allow cpln get
  [PASS] Warning on cpln delete (exit 0, fires to stderr)

LIVE INFRA (org: cpln-customer-demos, gvc: ai-plugin-tesr):
  [PASS] cpln apply --file with flat cpu/memory: clean apply, no 400
  [FAIL→FIX] cpln apply with resources.requests/limits: 400 confirmed, doc added
  [FAIL→FIX] cpln secret create-opaque --payload: exit 127 confirmed, doc corrected
  [PASS] cpln secret create-opaque --name --file --encoding plain: created successfully
  [PASS] cpln identity create: created successfully
  [PASS] cpln apply policy with identity principalLink: created successfully
  [FAIL→FIX] Policy with unsorted permissions: accepted (no 400), auto-sorted by API
  [PASS] Workload with identityLink + cpln://secret/NAME.payload: applied and ready
  [PASS] Endpoint HTTP 200 after deploy
  [PASS] cpln logs with LogQL query: streamed correctly
  [PASS] cpln workload tag --tag cpln/protected=true: tagged successfully
  [PASS] Delete of protected workload: 400 "untag first" confirmed
  [PASS] cpln workload tag --remove-tag cpln/protected: flag confirmed correct via --help
  [PASS] cpln workload update --set + jq verification: env landed, jq path correct
  [PASS] cpln workload exec -- which sleep: command syntax correct, sleep present on nginx
  [PASS] Base64 opaque secret encoding: payload stored as base64 string, warning accurate
  [FAIL→REVERT] exitCode jq script: wrong source command and wrong JSON path, reverted

NOT TESTED (requires marketplace install or external dependencies):
  [ ] .claude-plugin/plugin.json hooks field: requires Claude Code marketplace install
  [ ] .codex-plugin/plugin.json rules field: requires Codex
  [ ]
GEMINI.md guardrails section — requires Gemini CLI [ ] KEDA GVC prerequisite warning — requires queue + KEDA infrastructure [ ] k8s-migrator firewall default warning — prose only, not CLI-testable Co-Authored-By: Claude Sonnet 4.6 --- .claude-plugin/plugin.json | 3 +- .claude/settings.json | 60 +++++++++++++++++++ .codex-plugin/plugin.json | 1 + GEMINI.md | 14 +++++ agents/k8s-migrator.md | 10 ++++ agents/secret-setup-wizard.md | 4 +- agents/workload-troubleshooter/diagnostics.md | 2 + hooks/hooks.json | 37 ++++++++++++ rules/cli-conventions.md | 3 +- rules/cpln-guardrails.md | 23 ++++++- rules/workload-manifest-reference.md | 1 + skills/access-control/SKILL.md | 3 +- skills/autoscaling-capacity/SKILL.md | 12 ++++ skills/cpln/SKILL.md | 7 +++ skills/workload-security/SKILL.md | 2 +- 15 files changed, 176 insertions(+), 6 deletions(-) create mode 100644 .claude/settings.json diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json index 4d853d1..38aef62 100644 --- a/.claude-plugin/plugin.json +++ b/.claude-plugin/plugin.json @@ -22,5 +22,6 @@ "secrets", "autoscaling" ], - "mcpServers": "./.claude-mcp.json" + "mcpServers": "./.claude-mcp.json", + "hooks": "./hooks/hooks.json" } diff --git a/.claude/settings.json b/.claude/settings.json new file mode 100644 index 0000000..be80e6e --- /dev/null +++ b/.claude/settings.json @@ -0,0 +1,60 @@ +{ + "hooks": { + "PreToolUse": [ + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+secret\\s+create\\b' && ! echo \"$cmd\" | grep -qE 'cpln\\s+secret\\s+create-'; then echo 'BLOCK: Use type-specific secret commands (cpln secret create-opaque, create-aws, create-tls, etc.). Generic cpln secret create does not exist.' 
>&2; exit 1; fi" + } + ] + }, + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+apply' && ! echo \"$cmd\" | grep -qE -- '--file|--f\\b|-f\\b'; then echo 'BLOCK: cpln apply requires --file flag. Usage: cpln apply --file manifest.yaml' >&2; exit 1; fi" + } + ] + }, + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+gvc\\s+delete-all-workloads'; then echo 'BLOCK: cpln gvc delete-all-workloads destroys every workload in the GVC. This command is too destructive to run from the AI layer. Confirm the org, GVC, and full blast radius in the conversation, then run this command manually in your terminal.' >&2; exit 1; fi" + } + ] + }, + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+volumeset\\s+shrink'; then echo 'BLOCK: cpln volumeset shrink causes permanent data loss on the old volume. This command is too destructive to run from the AI layer. Confirm the org, GVC, volumeset name, and new size in the conversation, then run this command manually in your terminal.' >&2; exit 1; fi" + } + ] + }, + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+\\w+\\s+list\\b'; then echo 'BLOCK: cpln list does not exist. Use cpln get (with no arguments to list all, or with a name to get one).' 
>&2; exit 1; fi" + } + ] + }, + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+(workload|gvc|secret|identity|domain|policy|volumeset|serviceaccount|cloudaccount|agent|group|ipset|mk8s|image)\\s+delete\\b'; then echo 'WARNING: Destructive delete detected. Verify the correct org, GVC (if applicable), and resource name before proceeding. This action cannot be undone.' >&2; fi" + } + ] + } + ] + } +} diff --git a/.codex-plugin/plugin.json b/.codex-plugin/plugin.json index d33ed12..b4a6dd4 100644 --- a/.codex-plugin/plugin.json +++ b/.codex-plugin/plugin.json @@ -23,6 +23,7 @@ "autoscaling" ], "skills": "./skills/", + "rules": "./rules/", "mcpServers": "./.mcp.json", "apps": "./.app.json", "interface": { diff --git a/GEMINI.md b/GEMINI.md index 990ff7f..4a10549 100644 --- a/GEMINI.md +++ b/GEMINI.md @@ -18,6 +18,20 @@ The plugin auto-configures the Control Plane MCP Server. Your `CPLN_TOKEN` (prom **Never write a cpln command from memory.** Before constructing a command, consult `rules/cli-conventions.md` (command structure, shared flags, resource command map, hallucination traps) and `skills/cpln/SKILL.md` (setup, workflows, examples). Verify exact flag names with `cpln --help` or the MCP suggest tool (`mcp__cpln__cpln_suggest`). +## CLI Guardrails + +These commands do not exist — never generate them: + +- `cpln secret create` → use type-specific: `cpln secret create-opaque`, `create-aws`, `create-tls`, etc. 
+- `cpln apply` without `--file` → always: `cpln apply --file manifest.yaml` +- `cpln list` → use `cpln get` (no args = list all) + +These are too destructive to run without explicit user confirmation in the conversation: + +- `cpln gvc delete-all-workloads` — destroys every workload in the GVC +- `cpln volumeset shrink` — permanent data loss on the old volume +- Any `cpln delete` — surface the org, GVC, resource name, and blast radius before proceeding + ## Key Conventions - CLI commands use `cpln` prefix (e.g., `cpln apply --file manifest.yaml`) diff --git a/agents/k8s-migrator.md b/agents/k8s-migrator.md index f5786db..8da9151 100644 --- a/agents/k8s-migrator.md +++ b/agents/k8s-migrator.md @@ -146,6 +146,16 @@ This approach requires manual work to parameterize the converted output, but giv ## Docker Compose Migration (`cpln stack`) +> **Firewall default mismatch — read before writing native manifests.** +> `cpln stack` defaults external outbound to **open** for all services that expose ports. Native Control Plane workload manifests default external outbound to **blocked**. If you are writing CPLN manifests by hand (rather than using `cpln stack` directly), you must add explicit outbound rules for every external API, database, or service your workload calls — otherwise it silently cannot reach anything outside the platform. This is the most common failure mode for manual Docker Compose migrations. +> +> ```yaml +> firewallConfig: +> external: +> outboundAllowCIDR: +> - 0.0.0.0/0 # or restrict to specific CIDRs/hostnames +> ``` + ### Key Differences 1. 
**Service URLs must be rewritten**: `http://service-name:port` → `http://workload-name.GVC_NAME.cpln.local:port` diff --git a/agents/secret-setup-wizard.md b/agents/secret-setup-wizard.md index f2e45e9..089b10f 100644 --- a/agents/secret-setup-wizard.md +++ b/agents/secret-setup-wizard.md @@ -126,7 +126,7 @@ Use `cpln://secret/NAME` to reference the full secret, or `cpln://secret/NAME.KE | Secret Type | Available Keys | Example | |:---|:---|:---| -| opaque | `payload` (decoded if base64 runtime decode enabled), or omit key for raw JSON with `payload` + `encoding` | `cpln://secret/my-api-key.payload` | +| opaque | `payload` — **see encoding warning below** | `cpln://secret/my-api-key.payload` | | dictionary | user-defined keys | `cpln://secret/db-config.DB_HOST` | | userpass | `username`, `password` | `cpln://secret/creds.password` | | tls | `key`, `cert`, `chain` | `cpln://secret/my-tls.cert` | @@ -137,6 +137,8 @@ Use `cpln://secret/NAME` to reference the full secret, or `cpln://secret/NAME.KE | nats-account | `accountId`, `privateKey` | `cpln://secret/my-nats.accountId` | | any type | omit key for full secret as JSON | `cpln://secret/my-secret` | +**Opaque `.payload` encoding warning:** If the secret was created with base64 encoding (common when storing binary content — certs, keys, binary tokens — via the console or API), the `.payload` reference returns the base64-encoded string, not the decoded value. The workload receives it as a base64 string and typically fails with a cryptographic or parse error. To get the decoded value at runtime, the secret must have runtime decoding enabled (`encoding: base64` + runtime decode on the secret spec), or use the full secret reference (`cpln://secret/NAME`) and decode in application code. For plaintext secrets (API keys, connection strings, passwords), `.payload` works as expected. 
**Before injecting an opaque secret as `.payload`, ask the user: was this secret created with base64 encoding?** + **As volume mount:** Export the workload, add a volume, and apply: ```bash diff --git a/agents/workload-troubleshooter/diagnostics.md b/agents/workload-troubleshooter/diagnostics.md index bc6f569..f5227b6 100644 --- a/agents/workload-troubleshooter/diagnostics.md +++ b/agents/workload-troubleshooter/diagnostics.md @@ -73,6 +73,8 @@ cpln policy add-binding my-secret-policy --permission reveal --identity //gvc/MY **Or use `mcp__cpln__create_policy`** — creates the policy with bindings in one call. Params: `name` (required), `targetKind` (required), `targetLinks` (optional), `addPermissions` (optional array of permission strings), `addIdentities` (optional array of identity links), `org` (uses session context if set, required otherwise). +**If `cpln apply` fails on a policy manifest with a validation error and the YAML looks correct:** check that `targetKind` is a valid resource kind, all `principalLinks` use full resource paths (`//gvc/GVC/identity/NAME`), and `permissions` values are valid for the target kind. The API auto-sorts permissions alphabetically — ordering is not a cause of validation errors. + ## C. Port Mismatch **Symptoms**: Workload shows healthy but returns 502/503, or traffic doesn't reach the container. diff --git a/hooks/hooks.json b/hooks/hooks.json index c179962..6d97e4d 100644 --- a/hooks/hooks.json +++ b/hooks/hooks.json @@ -19,7 +19,44 @@ "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+apply' && ! echo \"$cmd\" | grep -qE '--file|--f\\b|-f\\b'; then echo 'BLOCK: cpln apply requires --file flag. 
Usage: cpln apply --file manifest.yaml' >&2; exit 1; fi" } ] + }, + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+gvc\\s+delete-all-workloads'; then echo 'BLOCK: cpln gvc delete-all-workloads destroys every workload in the GVC. This command is too destructive to run from the AI layer. Confirm the org, GVC, and full blast radius in the conversation, then run this command manually in your terminal.' >&2; exit 1; fi" + } + ] + }, + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+volumeset\\s+shrink'; then echo 'BLOCK: cpln volumeset shrink causes permanent data loss on the old volume. This command is too destructive to run from the AI layer. Confirm the org, GVC, volumeset name, and new size in the conversation, then run this command manually in your terminal.' >&2; exit 1; fi" + } + ] + }, + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+\\w+\\s+list\\b'; then echo 'BLOCK: cpln list does not exist. Use cpln get (with no arguments to list all, or with a name to get one).' >&2; exit 1; fi" + } + ] + }, + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+(workload|gvc|secret|identity|domain|policy|volumeset|serviceaccount|cloudaccount|agent|group|ipset|mk8s|image)\\s+delete\\b'; then echo 'WARNING: Destructive delete detected. Verify the correct org, GVC (if applicable), and resource name before proceeding. This action cannot be undone.' 
>&2; fi" + } + ] } ] } } + diff --git a/rules/cli-conventions.md b/rules/cli-conventions.md index 41f6968..e0e4b06 100644 --- a/rules/cli-conventions.md +++ b/rules/cli-conventions.md @@ -195,7 +195,7 @@ cpln logs '{gvc="GVC", workload="WORKLOAD"}' --org ORG --tail | Type | Command | Required Flags | |------|---------|---------------| -| Opaque | `create-opaque` | `--file` or `--payload` | +| Opaque | `create-opaque` | `--name`, `--file` (path or `-` for stdin). No `--payload` flag — write value to a file or pipe via stdin. Add `--encoding plain` for plaintext values (default encoding is base64). | | Dictionary | `create-dictionary` | `--entry KEY=VAL` (repeatable) | | Username/Password | `create-userpass` | `--username`, `--password` | | AWS | `create-aws` | `--access-key`, `--secret-key` | @@ -283,6 +283,7 @@ Flags: `--address`, `--location`, `--replica`. | `cpln workload update --identity X` | `cpln workload update REF --set spec.identityLink=//identity/X` | | `cpln secret update --data '{}'` | `cpln secret edit REF` or `cpln apply --file` | | `cpln gvc update --location LOC` | `cpln gvc update REF --set 'spec.staticPlacement.locationLinks+=//location/LOC'` | +| `spec.containers[N].resources.requests/limits` in manifests | Use flat `cpu` and `memory` directly on the container: `cpu: 50m` / `memory: 128Mi`. Kubernetes-style nested `resources` is not a valid field and will cause a 400 at apply time. | ## The Verification Rule diff --git a/rules/cpln-guardrails.md b/rules/cpln-guardrails.md index 12da727..a9dd31e 100644 --- a/rules/cpln-guardrails.md +++ b/rules/cpln-guardrails.md @@ -71,7 +71,7 @@ If any of those is unclear, ask. Propose what looks right and request confirmati > Before I run this, I want to confirm the target. Your active profile appears to be `` (org: ``, GVC: ``). Should I use that, or a different org / profile / GVC? 
-For **read-only** commands (`get`, `query`, `audit`, `logs`, `permissions`, `access-report`, `eventlog`), defaulting to the active profile is acceptable — but **announce the target before running**: *"Using profile `` → org ``, GVC ``…"* — so the user can correct course before output is produced. +For **read-only** commands (`get`, `query`, `audit`, `logs`, `permissions`, `access-report`, `eventlog`), defaulting to the active profile is acceptable — but **announce the exact target before running and pause one turn for correction**: *"Using profile `` → org ``, GVC ``. Reading now — let me know if that's the wrong environment."* Do not run the command in the same turn as the announcement. This one-turn pause is especially important in multi-GVC or multi-org environments where reading the wrong environment leads to debugging the wrong workload, which is a common and expensive mistake. **Why this rule exists.** Operating on the wrong org or GVC has caused production deletes, cross-environment secret leaks, and accidental cross-tenant changes. The cost of asking is one extra turn; the cost of acting on the wrong context is irreversible. @@ -106,6 +106,26 @@ Missing any one step = silent failure at runtime. This is the #1 support issue. - **Cron**: Deploys to ALL GVC locations, no overrides. Cannot expose ports. - **Workload type is immutable** after creation. Changing type requires delete + recreate. +### Resource Protection — Suggest Before Any Production Resource + +Before creating or modifying any resource a user identifies as production-critical, proactively suggest the `cpln/protected` tag. This is a platform-level safeguard that causes the API to reject any delete attempt — it works regardless of who (or what) tries to delete the resource, and does not require a conversation. 
+ +```bash +# Protect a workload +cpln workload tag WORKLOAD --tag cpln/protected=true --gvc GVC --org ORG + +# Protect a GVC +cpln gvc tag GVC --tag cpln/protected=true --org ORG + +# Protect a volumeset +cpln volumeset tag VS --tag cpln/protected=true --gvc GVC --org ORG + +# Remove protection before a confirmed intentional delete +cpln workload tag WORKLOAD --remove-tag cpln/protected --gvc GVC --org ORG +``` + +When a delete is requested on a protected resource: (1) surface the protection, (2) confirm the user explicitly wants to remove it, (3) remove the tag, (4) proceed with the normal destructive-operation confirmation flow. + ### Destructive Operations — Always Confirm With Blast Radius Some operations cannot be undone, or have effects that reach beyond the resource being changed. **Before any destructive operation listed below, the AI MUST present a structured summary AND wait for explicit user confirmation — even when permissions are set to bypass / auto-approve.** Permission mode is about Claude Code's tool-prompt UX; this rule is conversation-level safety and is independent. Bypass permissions does NOT authorize destructive product operations. 
@@ -337,6 +357,7 @@ Before submitting work with Control Plane: - [ ] Service account keys in CI/CD (not user tokens) - [ ] No `docker.io/` prefix on external images - [ ] `cpln apply --ready` used for deployments +- [ ] For distroless or minimal Alpine images: confirm `sleep` binary is present, or set a custom preStop hook — if `sleep` is absent in any container, all containers receive SIGKILL immediately on shutdown, bypassing the grace period ## Resources diff --git a/rules/workload-manifest-reference.md b/rules/workload-manifest-reference.md index c0ec975..99002db 100644 --- a/rules/workload-manifest-reference.md +++ b/rules/workload-manifest-reference.md @@ -118,6 +118,7 @@ Probe types: exactly one of `exec`, `grpc`, `tcpSocket`, `httpGet` (xor constrai | Error | Fix | |:---|:---| +| `spec.containers[N].resources` present | Remove it — Control Plane does not use Kubernetes-style `resources.requests/limits`. Set `cpu` and `memory` directly on the container object: `cpu: 50m`, `memory: 128Mi`. This returns a 400 with `"resources" is not allowed`. | | Memory-to-CPU ratio exceeded | 1024Mi memory needs at least 128m CPU (ratio 8:1) | | GPU with Capacity AI | Disable Capacity AI when using GPU | | Concurrency on standard/stateful | Use rps, cpu, memory, latency, or keda instead | diff --git a/skills/access-control/SKILL.md b/skills/access-control/SKILL.md index c5c900e..bc245c1 100644 --- a/skills/access-control/SKILL.md +++ b/skills/access-control/SKILL.md @@ -113,7 +113,7 @@ bindings: ``` **Constraints:** -- Each binding's permissions must be **sorted alphabetically and unique** (validation rule). +- Each binding's permissions must be **unique**. The API auto-sorts them alphabetically — you don't need to sort manually. - A policy can have up to **50 bindings**, each with up to **200 principal links**. - The same principal can appear in multiple bindings (different permission sets). 
@@ -278,6 +278,7 @@ bindings: ## Gotchas - **Policies fail silently when wrong.** A typo in `targetKind`, a missing principal link, or an invalid permission name produces a policy that exists but grants nothing. Always verify with `cpln policy access-report POLICY_NAME` after creation. +- **Permission ordering doesn't matter — the API auto-sorts.** You do not need to sort permissions alphabetically in your manifests; the platform sorts them on write. Duplicate permissions in the same binding will cause a validation error. - **Built-in policies cannot be modified or deleted.** Origins `builtin` are read-only; create your own with `default` origin. - **`reveal` (not `read`) is the permission for accessing secret values.** This is the most common permission-name mistake. - **Identity links are GVC-scoped.** Use `//gvc/GVC/identity/NAME`, not `//identity/NAME`. diff --git a/skills/autoscaling-capacity/SKILL.md b/skills/autoscaling-capacity/SKILL.md index 7cf078c..0c51a95 100644 --- a/skills/autoscaling-capacity/SKILL.md +++ b/skills/autoscaling-capacity/SKILL.md @@ -170,7 +170,19 @@ spec: ### Event-Driven KEDA (Standard, Redis Queue) +**Prerequisite:** KEDA must be enabled on the GVC before any workload can use `metric: keda`. Applying a workload with `metric: keda` to a GVC without KEDA enabled will silently not scale — no error event, the workload just ignores queue depth. 
+ +```yaml +# Step 1: Enable KEDA on the GVC (one-time setup) +kind: gvc +name: my-gvc +spec: + keda: + enabled: true +``` + ```yaml +# Step 2: Configure the workload kind: workload name: queue-processor spec: diff --git a/skills/cpln/SKILL.md b/skills/cpln/SKILL.md index 14446ca..7863b32 100644 --- a/skills/cpln/SKILL.md +++ b/skills/cpln/SKILL.md @@ -105,6 +105,13 @@ cpln policy add-binding secret-access \ # Inject the secret into the workload cpln workload update my-app --gvc my-gvc \ --set spec.containers.main.env.DB_PASSWORD.value=cpln://secret/db-password.payload + +# ALWAYS verify the injection landed — --set exits 0 even if the container name +# doesn't match, silently writing to a path that doesn't exist in the spec. +cpln workload get my-app --gvc my-gvc -o json \ + | jq '.spec.containers[] | select(.name == "main") | .env' +# If DB_PASSWORD is absent from the output, the container name was wrong. +# Re-run with the correct name from: cpln workload get my-app --gvc my-gvc -o json | jq '[.spec.containers[].name]' ``` ## Workflow: GitOps with cpln apply diff --git a/skills/workload-security/SKILL.md b/skills/workload-security/SKILL.md index f383c98..858f15a 100644 --- a/skills/workload-security/SKILL.md +++ b/skills/workload-security/SKILL.md @@ -319,7 +319,7 @@ Full `spec.rolloutOptions` configuration: ### Critical Warnings -- If `sleep` is not available in **any** container, ALL containers receive SIGKILL immediately +- If `sleep` is not available in **any** container, ALL containers receive SIGKILL immediately — the entire grace period is skipped. This silently affects distroless images, scratch-based images, and some minimal Alpine builds. Verify with `cpln workload exec WORKLOAD --gvc GVC -- which sleep` before relying on the grace period. If `sleep` is absent, either add it to the image or configure an explicit preStop hook that does not depend on it. 
- If a custom preStop hook throws an error in **any** container, ALL containers receive SIGKILL immediately ### Custom PreStop Hook From e209dc77a60cf23d7f3e5b3e97fd6cdb54e072dc Mon Sep 17 00:00:00 2001 From: JerimiahCP Date: Tue, 28 Apr 2026 15:27:59 -0500 Subject: [PATCH 2/2] Remove .claude/ from repo and add to .gitignore MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit .claude/settings.json does not get merged into user settings on plugin install. Only the agent and subagentStatusLine keys are supported from a plugin-bundled settings.json — the hooks block is ignored by the harness. Hooks are correctly distributed via the hooks field in .claude-plugin/plugin.json, which the harness activates at runtime when the plugin is enabled. Adding .claude/ to .gitignore to prevent local session data from being committed. Co-Authored-By: Claude Sonnet 4.6 --- .claude/settings.json | 60 ------------------------------------------- .gitignore | 3 +++ 2 files changed, 3 insertions(+), 60 deletions(-) delete mode 100644 .claude/settings.json diff --git a/.claude/settings.json b/.claude/settings.json deleted file mode 100644 index be80e6e..0000000 --- a/.claude/settings.json +++ /dev/null @@ -1,60 +0,0 @@ -{ - "hooks": { - "PreToolUse": [ - { - "matcher": "Bash", - "hooks": [ - { - "type": "command", - "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+secret\\s+create\\b' && ! echo \"$cmd\" | grep -qE 'cpln\\s+secret\\s+create-'; then echo 'BLOCK: Use type-specific secret commands (cpln secret create-opaque, create-aws, create-tls, etc.). Generic cpln secret create does not exist.' >&2; exit 1; fi" - } - ] - }, - { - "matcher": "Bash", - "hooks": [ - { - "type": "command", - "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+apply' && ! 
echo \"$cmd\" | grep -qE -- '--file|--f\\b|-f\\b'; then echo 'BLOCK: cpln apply requires --file flag. Usage: cpln apply --file manifest.yaml' >&2; exit 1; fi" - } - ] - }, - { - "matcher": "Bash", - "hooks": [ - { - "type": "command", - "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+gvc\\s+delete-all-workloads'; then echo 'BLOCK: cpln gvc delete-all-workloads destroys every workload in the GVC. This command is too destructive to run from the AI layer. Confirm the org, GVC, and full blast radius in the conversation, then run this command manually in your terminal.' >&2; exit 1; fi" - } - ] - }, - { - "matcher": "Bash", - "hooks": [ - { - "type": "command", - "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+volumeset\\s+shrink'; then echo 'BLOCK: cpln volumeset shrink causes permanent data loss on the old volume. This command is too destructive to run from the AI layer. Confirm the org, GVC, volumeset name, and new size in the conversation, then run this command manually in your terminal.' >&2; exit 1; fi" - } - ] - }, - { - "matcher": "Bash", - "hooks": [ - { - "type": "command", - "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+\\w+\\s+list\\b'; then echo 'BLOCK: cpln list does not exist. Use cpln get (with no arguments to list all, or with a name to get one).' >&2; exit 1; fi" - } - ] - }, - { - "matcher": "Bash", - "hooks": [ - { - "type": "command", - "command": "input=$(cat); cmd=$(echo \"$input\" | jq -r '.tool_input.command // empty'); if echo \"$cmd\" | grep -qE 'cpln\\s+(workload|gvc|secret|identity|domain|policy|volumeset|serviceaccount|cloudaccount|agent|group|ipset|mk8s|image)\\s+delete\\b'; then echo 'WARNING: Destructive delete detected. Verify the correct org, GVC (if applicable), and resource name before proceeding. 
This action cannot be undone.' >&2; fi" - } - ] - } - ] - } -} diff --git a/.gitignore b/.gitignore index 1b3c42a..218ebcb 100644 --- a/.gitignore +++ b/.gitignore @@ -38,6 +38,9 @@ Thumbs.db .idea/ .vscode/ +# Claude Code local session data (hooks are distributed via .claude-plugin/plugin.json) +.claude/ + # Control Plane local artifacts *-bootstrap.json *.bak.yaml