feat: env var key injection for ephemeral environments #35
…T_PRIVATE_KEY_JWK)

Add support for injecting the agent private key via environment variable for containerised/serverless deployments where ~/.capiscio is ephemeral. Key priority: env var > local file > generate new via Init RPC. On first-run identity generation, a capture hint is logged to stderr with the compact JSON JWK for the operator to persist in their secrets manager.

- Add _public_jwk_from_private() and _log_agent_key_capture_hint() helpers
- Add ENV_AGENT_PRIVATE_KEY constant
- Rewrite _init_identity() with three-source priority
- Update from_env() docs
- Add 4 unit tests for env var injection and capture hint
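The `_public_jwk_from_private()` helper named in this commit is not shown in the conversation. The sketch below is one way such a helper could work, assuming the private JWK already carries the public `x` parameter, as Ed25519 OKP keys in JWK form normally do. Under that assumption the public JWK is the same object without `d`. The function name and the placeholder values are illustrative and are not taken from the SDK.

```python
def public_jwk_from_private(private_jwk: dict) -> dict:
    """Return a public OKP JWK by dropping the private parameter "d".

    Hypothetical stand-in for the PR's _public_jwk_from_private(); the real
    helper may behave differently.
    """
    if private_jwk.get("kty") != "OKP" or "x" not in private_jwk:
        raise ValueError("expected an OKP JWK that includes the public 'x' value")
    # Copy every member except the private scalar.
    return {name: value for name, value in private_jwk.items() if name != "d"}


# Placeholder values, not real key material.
example_private = {
    "kty": "OKP",
    "crv": "Ed25519",
    "d": "PRIVATE-PLACEHOLDER",
    "x": "PUBLIC-PLACEHOLDER",
    "kid": "did:key:example",
}
example_public = public_jwk_from_private(example_private)
```

If a key arrived without `x`, the public value would have to be derived from `d` with a cryptography library, which this sketch does not attempt.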
- Add env var table to README with CAPISCIO_AGENT_PRIVATE_KEY_JWK
- Add deployment section with capture hint and docker-compose example
- Add Agent Identity Variables section to configuration guide
- Add ephemeral deployment guidance and key rotation instructions
The registry assigns did:web when an API key is used. did:key is only for local dev mode without a registry.
✅ Documentation validation passed!
✅ SDK server contract tests passed (test_server_integration.py). Cross-product scenarios are validated in capiscio-e2e-tests.
Pull request overview
Adds support for injecting an agent’s Ed25519 private key via environment variable to keep agent identity stable in ephemeral deployments, along with updated docs and tests for the new behavior.
Changes:
- Add CAPISCIO_AGENT_PRIVATE_KEY_JWK support with resolution priority (env var → disk → generate).
- Emit a “capture hint” on first key generation to help operators persist keys for ephemeral environments.
- Update docs/README and add unit tests covering env-var identity loading and capture-hint behavior.
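The resolution priority in the first bullet (env var → disk → generate) can be sketched as a small standalone function. This is a simplified illustration, not the PR's `_init_identity()`: `resolve_private_jwk` and `fake_generate` are hypothetical names, the generation step (an Init RPC in the PR) is replaced by a callable, and the key file name `private.jwk` is taken from the review comments.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable, Mapping, Optional, Tuple

ENV_AGENT_PRIVATE_KEY = "CAPISCIO_AGENT_PRIVATE_KEY_JWK"  # variable name from the PR


def resolve_private_jwk(
    keys_dir: Path,
    generate: Callable[[], dict],
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[dict, str]:
    """Return (jwk, source) using the order env var, then disk, then generate."""
    env = os.environ if env is None else env
    raw = env.get(ENV_AGENT_PRIVATE_KEY)
    if raw:
        # Malformed JSON raises here instead of falling through to another source.
        return json.loads(raw), "env"

    key_file = keys_dir / "private.jwk"
    if key_file.exists():
        return json.loads(key_file.read_text()), "file"

    return generate(), "generated"


# Demonstration with placeholder keys; fake_generate stands in for the Init RPC.
def fake_generate() -> dict:
    return {"kty": "OKP", "crv": "Ed25519", "d": "new", "x": "new"}


with tempfile.TemporaryDirectory() as tmp:
    keys_dir = Path(tmp)
    _, first = resolve_private_jwk(keys_dir, fake_generate, env={})
    (keys_dir / "private.jwk").write_text(
        '{"kty":"OKP","crv":"Ed25519","d":"disk","x":"disk"}'
    )
    _, second = resolve_private_jwk(keys_dir, fake_generate, env={})
    injected = {ENV_AGENT_PRIVATE_KEY: '{"kty":"OKP","crv":"Ed25519","d":"env","x":"env"}'}
    _, third = resolve_private_jwk(keys_dir, fake_generate, env=injected)
```

Passing `env` explicitly keeps the function testable without touching the real process environment.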
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| capiscio_sdk/connect.py | Implements env-var private JWK loading, persistence, and capture-hint logging. |
| tests/unit/test_connect.py | Adds unit tests for env-var identity loading/precedence and capture-hint behavior. |
| docs/guides/configuration.md | Documents ephemeral key injection and deployment examples (Docker/K8s). |
| README.md | Updates examples and adds a section describing container/serverless key persistence. |
def _log_agent_key_capture_hint(agent_id: str, private_jwk: dict) -> None:
    """Log a one-time hint telling the user how to persist key material."""
    compact_json = json.dumps(private_jwk, separators=(",", ":"))
    logger.warning(
        "\n"
        " \u2554" + "\u2550" * 62 + "\u2557\n"
        " \u2551 New agent identity generated \u2014 save key for persistence \u2551\n"
        " \u255a" + "\u2550" * 62 + "\u255d\n"
        "\n"
        " If this agent runs in an ephemeral environment (containers,\n"
        " serverless, CI) the identity will be lost on restart unless\n"
        " you persist the private key.\n"
        "\n"
        " Add to your secrets manager / .env:\n"
        "\n"
        " CAPISCIO_AGENT_PRIVATE_KEY_JWK='" + compact_json + "'\n"
        "\n"
        " The DID will be recovered automatically from the JWK on startup.\n"
    )
_log_agent_key_capture_hint logs the full private JWK (including the private d value). This will leak long-lived signing key material into application logs (often shipped to centralized log stores), which is a high-impact secret disclosure. Consider removing the private key from logs (e.g., log only the file path to private.jwk, or redact d), or gate printing the full JWK behind an explicit opt-in env var/flag so it never happens by default in production.
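A minimal sketch of the redaction option mentioned in this comment: the hint names the agent and the key file but never prints the `d` value. `redact_private_jwk` and `format_capture_hint` are hypothetical helpers written for illustration; the agent id, path, and key values are placeholders.

```python
import json


def redact_private_jwk(private_jwk: dict) -> dict:
    """Copy the JWK with the private parameter "d" replaced by a marker."""
    redacted = dict(private_jwk)
    if "d" in redacted:
        redacted["d"] = "<redacted>"
    return redacted


def format_capture_hint(agent_id: str, private_jwk: dict, key_path: str) -> str:
    """Build a log-safe hint that points at the key file instead of printing the key."""
    safe_json = json.dumps(redact_private_jwk(private_jwk), separators=(",", ":"))
    return (
        f"New identity generated for agent {agent_id}.\n"
        f"Private key written to {key_path}; copy it from that file into your secrets manager.\n"
        f"Key summary (private value redacted): {safe_json}"
    )


hint = format_capture_hint(
    "agent-123",  # placeholder agent id
    {"kty": "OKP", "crv": "Ed25519", "d": "SECRET-PLACEHOLDER", "x": "PUBLIC-PLACEHOLDER"},
    "/path/to/private.jwk",  # placeholder path
)
```

Including the agent id in the message also gives the otherwise unused `agent_id` parameter a purpose when several agents share one log stream.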
    return server_did if server_did else did

except (json.JSONDecodeError, ValueError) as e:
    logger.error(f"Invalid {ENV_AGENT_PRIVATE_KEY}: {e} — falling through to local keys")
When CAPISCIO_AGENT_PRIVATE_KEY_JWK is present but invalid JSON / missing fields, the code logs an error and silently falls through to local keys / generating a new identity. In ephemeral deployments this can cause an unexpected DID/key rotation (and invalidate existing badges) due to a simple misconfiguration. Safer behavior would be to fail fast with a ConfigurationError when the env var is set but invalid, so operators notice immediately rather than getting a new identity.
Suggested change:
- logger.error(f"Invalid {ENV_AGENT_PRIVATE_KEY}: {e} — falling through to local keys")
+ logger.error(f"Invalid {ENV_AGENT_PRIVATE_KEY}: {e}")
+ raise ConfigurationError(f"Invalid {ENV_AGENT_PRIVATE_KEY}: {e}") from e
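The fail-fast behavior can also be sketched as a standalone parser that covers both cases named in the comment, invalid JSON and missing fields. This is an illustration under assumptions: `ConfigurationError` is redefined locally as a stand-in for the SDK's class, and `REQUIRED_FIELDS`, `parse_private_jwk`, and `error_for` are hypothetical names.

```python
import json


class ConfigurationError(Exception):
    """Stand-in for the SDK's ConfigurationError, defined here so the sketch runs alone."""


REQUIRED_FIELDS = ("kty", "crv", "d", "x")  # assumed minimum for an Ed25519 private JWK


def parse_private_jwk(raw: str, var_name: str = "CAPISCIO_AGENT_PRIVATE_KEY_JWK") -> dict:
    """Parse the env var value or raise ConfigurationError; never fall through."""
    try:
        jwk = json.loads(raw)
    except json.JSONDecodeError as e:
        # e.msg describes the syntax problem without echoing the secret value.
        raise ConfigurationError(f"Invalid {var_name}: not valid JSON ({e.msg})") from e
    if not isinstance(jwk, dict):
        raise ConfigurationError(f"Invalid {var_name}: expected a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in jwk]
    if missing:
        raise ConfigurationError(f"Invalid {var_name}: missing {', '.join(missing)}")
    if jwk["kty"] != "OKP" or jwk["crv"] != "Ed25519":
        raise ConfigurationError(f"Invalid {var_name}: expected an OKP/Ed25519 key")
    return jwk


def error_for(raw: str) -> str:
    """Demonstration helper: return the error text, or "" if parsing succeeds."""
    try:
        parse_private_jwk(raw)
    except ConfigurationError as e:
        return str(e)
    return ""
```

The error messages name the variable and the problem but deliberately leave out the raw value, so a misconfigured key is not copied into logs.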
# Derive public JWK and persist to disk for subsequent restarts
public_jwk = _public_jwk_from_private(private_jwk)
self.keys_dir.mkdir(parents=True, exist_ok=True)
private_key_path.write_text(json.dumps(private_jwk, indent=2))
os.chmod(private_key_path, 0o600)
public_key_path.write_text(json.dumps(public_jwk, indent=2))

# Register with server (idempotent)
The env-var key injection path unconditionally writes private.jwk/public.jwk to keys_dir and chmods the private key. If keys_dir is read-only or chmod is unsupported (common in some container/serverless setups), this will raise and prevent startup even though the key is available via the env var. Consider making disk persistence best-effort (catch OSError/PermissionError) or allowing an opt-out so env-var injection still works without filesystem writes.
Suggested change:
- # Derive public JWK and persist to disk for subsequent restarts
- public_jwk = _public_jwk_from_private(private_jwk)
- self.keys_dir.mkdir(parents=True, exist_ok=True)
- private_key_path.write_text(json.dumps(private_jwk, indent=2))
- os.chmod(private_key_path, 0o600)
- public_key_path.write_text(json.dumps(public_jwk, indent=2))
- # Register with server (idempotent)
+ # Derive public JWK and (best-effort) persist to disk for subsequent restarts
+ public_jwk = _public_jwk_from_private(private_jwk)
+ try:
+     self.keys_dir.mkdir(parents=True, exist_ok=True)
+     private_key_path.write_text(json.dumps(private_jwk, indent=2))
+     try:
+         os.chmod(private_key_path, 0o600)
+     except (OSError, NotImplementedError) as chmod_err:
+         # On some platforms or filesystems chmod may not be supported
+         logger.debug(
+             f"Unable to set permissions on {private_key_path}: {chmod_err!r}"
+         )
+     public_key_path.write_text(json.dumps(public_jwk, indent=2))
+ except (OSError, PermissionError) as fs_err:
+     # In read-only/container/serverless environments, persisting keys may fail.
+     # Continue using the in-memory key from the environment variable.
+     logger.warning(
+         "Unable to persist agent keys to %s: %s; "
+         "continuing with in-memory key from %s",
+         self.keys_dir,
+         fs_err,
+         ENV_AGENT_PRIVATE_KEY,
+     )
+ # Register with server (idempotent) even if persistence failed
def _log_agent_key_capture_hint(agent_id: str, private_jwk: dict) -> None:
    """Log a one-time hint telling the user how to persist key material."""
agent_id is accepted as a parameter but never used in _log_agent_key_capture_hint, which makes the signature misleading and prevents the hint from identifying which agent the key belongs to. Either include agent_id in the log message (useful when multiple agents run in the same logs) or remove the parameter to avoid dead arguments.
_log_agent_key_capture_hint,
_public_jwk_from_private,
The updated import list includes _log_agent_key_capture_hint and _public_jwk_from_private, but they are not referenced anywhere in this test module (the hint function is patched via string path). Consider removing these unused imports to keep the test module tidy and avoid lint noise if ruff is made stricter.
Suggested change:
- _log_agent_key_capture_hint,
- _public_jwk_from_private,
In ephemeral environments (Docker, Lambda, Cloud Run) the local `~/.capiscio/` directory
doesn't survive restarts. On first run the SDK generates a keypair and logs a capture hint:

```
╔══════════════════════════════════════════════════════════════════╗
║ New agent identity generated — save key for persistence ║
╚══════════════════════════════════════════════════════════════════╝

Add to your secrets manager / .env:

CAPISCIO_AGENT_PRIVATE_KEY_JWK='{"kty":"OKP","crv":"Ed25519","d":"...","x":"...","kid":"did:key:z6Mk..."}'
```
These deployment instructions show a "capture hint" where the SDK logs the full CAPISCIO_AGENT_PRIVATE_KEY_JWK value, including the private Ed25519 key, and recommend copying it from logs. Emitting long‑lived private keys into application logs exposes them to log collectors and anyone with observability access, allowing undetected agent/DID impersonation. The bootstrap flow should be redesigned so private keys are never written to logs, and the docs should direct users to obtain the key from a secure export path (e.g., a one‑time CLI or local file), not from log output.
**First-run capture:** On the very first run, the SDK logs a capture hint to stderr with the full JWK. Copy it into your secrets manager:

```
CAPISCIO_AGENT_PRIVATE_KEY_JWK='{"kty":"OKP","crv":"Ed25519","d":"...","x":"...","kid":"did:key:z6Mk..."}'
```
This section documents a flow where the SDK logs the full private Ed25519 JWK (including the d parameter) to stderr and instructs operators to copy it into a secrets manager. Logging full private keys means anyone with access to container/serverless logs or a central log aggregator can recover the key and impersonate the agent/DID indefinitely. Instead, the SDK and docs should avoid printing raw private key material to logs and use a secure bootstrap/export mechanism (e.g., explicit CLI command or manual export from local key files) that does not traverse shared logging infrastructure.
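The "manual export from local key files" alternative could look roughly like the sketch below: an operator reads the key file written on first run and produces the single-line value expected by the environment variable, without the key passing through the logger. `compact_jwk_from_file` is a hypothetical helper, and the demonstration uses a temporary file with placeholder values rather than a real key directory.

```python
import json
import tempfile
from pathlib import Path


def compact_jwk_from_file(path: Path) -> str:
    """Read a JWK file and return it as single-line compact JSON."""
    jwk = json.loads(path.read_text())
    return json.dumps(jwk, separators=(",", ":"))


# Demonstration against a temporary file with placeholder values.
with tempfile.TemporaryDirectory() as tmp:
    key_file = Path(tmp) / "private.jwk"
    key_file.write_text(
        json.dumps({"kty": "OKP", "crv": "Ed25519", "d": "D", "x": "X"}, indent=2)
    )
    exported = compact_jwk_from_file(key_file)
```

In practice the returned string would be handed straight to a secrets manager's own tooling instead of being printed to a shared terminal or log.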
The 'import capiscio_sdk.connect as connect_module' statement resolves to the CapiscIO.connect classmethod rather than the connect submodule because capiscio_sdk/__init__.py re-exports 'connect'. This causes patch.object() to fail on Python 3.10+ when trying to patch module-level functions like _log_agent_key_capture_hint. Use importlib.import_module() to ensure we get the actual module object.
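The shadowing described in this commit message can be reproduced with a throwaway package. The sketch below uses a hypothetical `demo_pkg` (not capiscio_sdk) whose `__init__.py` rebinds the name `connect` to a classmethod after the submodule is loaded. Under that setup, `import demo_pkg.connect as ...` binds the package attribute, while `importlib.import_module()` returns the entry from `sys.modules`, which is the real submodule and can be patched.

```python
import importlib
import inspect
import sys
import tempfile
from pathlib import Path
from unittest.mock import patch

# Build a throwaway package whose __init__ re-exports a name equal to a submodule name.
pkg_root = Path(tempfile.mkdtemp())
pkg_dir = pkg_root / "demo_pkg"  # hypothetical package, not capiscio_sdk
pkg_dir.mkdir()
(pkg_dir / "connect.py").write_text(
    "class Client:\n"
    "    @classmethod\n"
    "    def connect(cls):\n"
    "        return cls()\n"
    "\n"
    "def _helper():\n"
    "    return 'real'\n"
)
(pkg_dir / "__init__.py").write_text(
    "from demo_pkg.connect import Client\n"
    "connect = Client.connect  # shadows the submodule attribute\n"
)
sys.path.insert(0, str(pkg_root))
importlib.invalidate_caches()

import demo_pkg.connect as shadowed  # binds the re-exported classmethod

real_module = importlib.import_module("demo_pkg.connect")  # the actual submodule

# Patching a module-level function works on the real module object.
with patch.object(real_module, "_helper", return_value="patched"):
    patched_result = real_module._helper()
```

Here `inspect.ismodule(shadowed)` is false and `inspect.ismodule(real_module)` is true, which is the distinction the fix relies on.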
✅ All checks passed! Ready for review.
Summary
Add environment variable key injection for ephemeral environments (Docker, Lambda, Cloud Run) where ~/.capiscio/ doesn't survive restarts.
Changes
feat: env var key injection (CAPISCIO_AGENT_PRIVATE_KEY_JWK)
- CAPISCIO_AGENT_PRIVATE_KEY_JWK allows injecting a JWK-encoded Ed25519 private key
docs: ephemeral deployment documentation
fix: correct did:key → did:web for production registry usage
- did:web is assigned (not did:key)
- did:key is only used in local dev mode without a registry
Testing