crypto: versioned seed-phrase KDF (v2 = PBKDF2-600k) with trial-based legacy detection#232

Merged: cvince merged 2 commits into main from crypto/versioned-kdf on Jun 9, 2026

@cvince cvince commented Jun 9, 2026

What

Upgrades the seed-phrase KDF and makes it versioned, so we can strengthen M's derivation without re-keying or locking out existing owners.

  • v1 (legacy, unchanged): PBKDF2(phrase, "capy-mnemonic", 2048, SHA-512) — the BIP-39 default.
  • v2 (new, current): PBKDF2(phrase, "capy-mnemonic-v2", 600_000, SHA-256) — OWASP 2023 guidance.

New orgs and local-mode setups derive M under v2. Existing orgs keep deriving under v1.
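A minimal sketch of what the versioned registry could look like, using Node's built-in `pbkdf2Sync`. The salts, iteration counts, and digests come from the list above; the 32-byte output length and the exact shape of `KDF_VERSIONS` are assumptions made for illustration, not the merged code.

```typescript
import { pbkdf2Sync } from "node:crypto";

type KdfVersion = 1 | 2;

interface KdfParams {
  salt: string;
  iterations: number;
  digest: "sha512" | "sha256";
}

// Parameters as listed in this PR description.
const KDF_VERSIONS: Record<KdfVersion, KdfParams> = {
  1: { salt: "capy-mnemonic", iterations: 2048, digest: "sha512" },
  2: { salt: "capy-mnemonic-v2", iterations: 600_000, digest: "sha256" },
};

const CURRENT_KDF_VERSION: KdfVersion = 2;

// ASSUMPTION: the PR does not state the derived key length; 32 bytes is a guess.
const MASTER_KEY_BYTES = 32;

// Derives M from the phrase. Omitting `version` uses the current one.
function seedPhraseToMasterKey(
  phrase: string,
  version: KdfVersion = CURRENT_KDF_VERSION,
): Buffer {
  const { salt, iterations, digest } = KDF_VERSIONS[version];
  return pbkdf2Sync(phrase, salt, iterations, MASTER_KEY_BYTES, digest);
}
```

Because the salt and parameters differ per version, the same phrase yields unrelated M values under v1 and v2, which is why an existing org can never silently switch versions.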

Why this shape (the honest fork, not a "transparent migration")

M is the root of the whole key tree (deriveProjectKey = HKDF(M, …)), and it's derived deterministically from the phrase with a fixed salt — there's nowhere durable to store a per-user salt or a version marker, because capy recover must rebuild M from the 24 words alone, offline, after a full machine wipe.

That makes a true "upgrade everyone's KDF in place" impossible without either re-encrypting all data or storing a key blob server-side (a zero-trust posture change). So instead:

  • New orgs get v2. Existing orgs stay v1 forever — their M value is bound to v1 and must never change.
  • The KDF version is not stored anywhere. Of the three phrase→M boundaries, decrypt and recover detect the version by trial decryption against a piece of the org's own ciphertext, while create has no existing ciphertext and pins the current version.

No data migration, no server change, no stored marker.
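The trial-detection idea can be sketched roughly as follows. AES-256-GCM authenticates the ciphertext, so a key derived under the wrong KDF version fails the tag check instead of producing garbage. The `SealedBlob` layout, the helper names, and the candidate list are hypothetical; in the real code the candidate keys would be project keys derived from M under each version.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical blob layout for this sketch.
interface SealedBlob {
  iv: Buffer;
  tag: Buffer;
  data: Buffer;
}

function seal(key: Buffer, plaintext: Buffer): SealedBlob {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), data };
}

// Returns the plaintext, or null when the GCM tag does not verify.
function tryOpen(key: Buffer, blob: SealedBlob): Buffer | null {
  try {
    const decipher = createDecipheriv("aes-256-gcm", key, blob.iv);
    decipher.setAuthTag(blob.tag);
    return Buffer.concat([decipher.update(blob.data), decipher.final()]);
  } catch {
    return null;
  }
}

interface Candidate {
  version: number;
  key: Buffer; // 32-byte key derived under `version`
}

// Returns the first version whose key opens the oracle, or null if none does.
function resolveVersionByTrial(
  candidates: Candidate[],
  oracle: SealedBlob,
): number | null {
  for (const { version, key } of candidates) {
    if (tryOpen(key, oracle) !== null) return version;
  }
  return null;
}
```

A null result means the phrase matches no version, which is the signal used to refuse a wrong phrase.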

⚠️ Trade-off, stated plainly: this strengthens the KDF for new orgs only. Existing owners are grandfathered on v1. Truly upgrading them would require re-encryption or a server-side envelope key — out of scope here by design.

Changes

  • keyManager: KdfVersion type + CURRENT_KDF_VERSION / KDF_VERSIONS registry; seedPhraseToMasterKey(phrase, version?) defaults to current. (Adding a future v3 = Argon2id is a one-case change; PBKDF2 chosen to keep the standalone pkg binary free of native addons.)
  • keyResolver: resolveProjectKeyByTrial() returns the version whose project key decrypts a given ciphertext; resolveFromSeedPhrase gains an optional version.
  • decrypt — trials versions against the encrypted .env oracle; caches the resolved M in the recovery session.
  • recover — trials against a ciphertext fetched from the server. This also fixes a long-standing gap: recover used to write a key.enc for a wrong phrase silently and only fail later. A phrase that matches nothing in the org is now rejected before anything is written. (If the org has no stored secrets, there's no oracle — it warns and writes under the current version.)
  • orgCreation — new orgs pinned to CURRENT_KDF_VERSION explicitly.
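The recover behaviour described in the list above boils down to a small decision, sketched here under assumptions. `planRecover`, `RecoverPlan`, and the `matches` callback are hypothetical names, and the newest-first trial order is a guess; the PR only states the three outcomes (write the matching version, reject a non-matching phrase, warn and use the current version when there is no oracle).

```typescript
type KdfVersion = 1 | 2;
const CURRENT_KDF_VERSION: KdfVersion = 2;

type RecoverPlan =
  | { action: "write"; version: KdfVersion; warning?: string }
  | { action: "reject"; reason: string };

// `matches(v)` is a hypothetical callback: true when the key derived from the
// phrase under version `v` decrypts the ciphertext fetched from the server.
function planRecover(
  hasOracle: boolean,
  matches: (version: KdfVersion) => boolean,
): RecoverPlan {
  if (!hasOracle) {
    // No stored secrets, so nothing to trial against.
    return {
      action: "write",
      version: CURRENT_KDF_VERSION,
      warning: "no ciphertext to verify the phrase against",
    };
  }
  // ASSUMPTION: newest version is tried first; the order is not stated in the PR.
  const order: KdfVersion[] = [2, 1];
  for (const version of order) {
    if (matches(version)) return { action: "write", version };
  }
  // Nothing is written for a phrase that matches no version.
  return { action: "reject", reason: "phrase matches nothing in the org" };
}
```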

Tests

  • Golden vectors pin the exact v1 and v2 outputs, so a parameter drift that would lock owners out fails CI.
  • Migration proof (tests/crypto/kdfMigration.test.ts): a v1 org's ciphertext is correctly resolved under the new code; a v2 org resolves to v2; cross-version keys are isolated; a wrong phrase resolves to null.
  • Real-wiring recover (tests/commands/recoverKdf.test.ts): drives RecoverCommand end-to-end and asserts it writes the correct-version M (v1 and v2) and refuses a non-matching phrase.
  • Full suite green (405 tests), typecheck + build clean.

…al-based legacy detection

Upgrade the seed-phrase KDF from PBKDF2-2048/SHA-512 (v1, the BIP-39
default) to PBKDF2-600k/SHA-256 (v2, OWASP) for defense-in-depth. The
24-word mnemonic already carries 256-bit entropy, so v1 was never
brute-forceable; v2 closes the gap for partially-leaked / non-uniform
phrases and satisfies OWASP guidance.

M is derived deterministically from the phrase with no stored salt
(recovery must work offline from the 24 words alone after a machine
wipe), so the KDF version cannot be persisted. New orgs/local setups
derive M under the current version; existing orgs stay on their original
version forever and are detected at the phrase->M boundaries by trial
decryption against a known ciphertext — no data migration, no server
change, no stored version marker.

- keyManager: KdfVersion + CURRENT_KDF_VERSION/KDF_VERSIONS registry;
  seedPhraseToMasterKey(phrase, version?) defaults to current.
- keyResolver: resolveProjectKeyByTrial() picks the version whose key
  decrypts a given ciphertext; resolveFromSeedPhrase takes a version.
- decrypt: trials versions against the encrypted .env oracle.
- recover: trials against a fetched ciphertext oracle, which also fixes
  the long-standing gap where a wrong phrase wrote a bad key.enc
  silently — it is now rejected before anything is written.
- orgCreation: new orgs pinned to CURRENT_KDF_VERSION explicitly.
- Tests: golden vectors pin v1+v2 params; migration tests prove a v1
  org decrypts under the new code and the trial resolver picks the
  right version / refuses a wrong phrase (crypto + real-wiring recover).

encryptMasterKey/decryptMasterKey now bind Additional Authenticated Data
so a wrapped master-key blob can't be verified under a different (user,
org) or moved between the org and local-only keystores:

- org wrapping (key.enc): AAD = masterKeyAAD(userId, orgId)
- local-only keystore (key.local): AAD = LOCAL_MASTER_KEY_AAD (domain tag)

decryptMasterKey verifies against the supplied AAD first, then falls back
to a no-AAD decrypt for blobs written before AAD binding existed, so
legacy blobs are grandfathered transparently. A blob written WITH an AAD
never verifies under a different one (the wrong-AAD attempt fails the GCM
tag check and the no-AAD
fallback also fails), so cross-context substitution is rejected; only
genuinely AAD-less legacy blobs reach the fallback. Every reader of a
master-key blob (keyResolver, invite, transport, local unlock) now passes
the matching AAD.
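
A rough sketch of AAD binding with the legacy fallback, using Node's AES-256-GCM. The function names mirror the ones in this message, but the blob layout, the signatures, and the AAD encoding inside `masterKeyAAD` are assumptions for illustration only.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical blob layout for this sketch.
interface WrappedKey {
  iv: Buffer;
  tag: Buffer;
  data: Buffer;
}

// ASSUMPTION: the real AAD encoding is not shown in the PR.
const masterKeyAAD = (userId: string, orgId: string): Buffer =>
  Buffer.from(`master-key|user:${userId}|org:${orgId}`, "utf8");

// `aad` is optional only so the sketch can produce a legacy (no-AAD) blob.
function encryptMasterKey(wrapKey: Buffer, masterKey: Buffer, aad?: Buffer): WrappedKey {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", wrapKey, iv);
  if (aad) cipher.setAAD(aad);
  const data = Buffer.concat([cipher.update(masterKey), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), data };
}

// Single GCM open attempt; returns null when the tag does not verify.
function gcmOpen(wrapKey: Buffer, blob: WrappedKey, aad?: Buffer): Buffer | null {
  try {
    const decipher = createDecipheriv("aes-256-gcm", wrapKey, blob.iv);
    if (aad) decipher.setAAD(aad);
    decipher.setAuthTag(blob.tag);
    return Buffer.concat([decipher.update(blob.data), decipher.final()]);
  } catch {
    return null;
  }
}

// Tries the supplied AAD first, then the no-AAD path for legacy blobs.
function decryptMasterKey(wrapKey: Buffer, blob: WrappedKey, aad: Buffer): Buffer | null {
  return gcmOpen(wrapKey, blob, aad) ?? gcmOpen(wrapKey, blob);
}
```

The fallback cannot be used to strip an AAD: a blob sealed with an AAD fails the tag check both under a different AAD and under none, so only blobs that never had one reach the second attempt successfully.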

The other AES-GCM call sites in packages/cli/src/crypto/ — value
encryption (encryptor), invite/deploy wrapping (inviteCrypto,
deployCrypto/deployRuntime) — keep no AAD by design: each derives its key
via HKDF with the context already in the salt/info (projectId+orgId,
orgId:email, deployId), so AAD would be redundant, and binding it on
stored values would require re-encrypting everything. Each site is now
commented to that effect. Service-side kms.ts already binds AAD via
contextAAD / EncryptionContext (unchanged).
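
To illustrate why AAD would be redundant at those sites, here is a minimal sketch of context-bound key derivation with Node's `hkdfSync`. The `info` string layout, the empty salt, and the SHA-256 choice are assumptions; the message above only says the context (for example projectId+orgId) is already part of the HKDF salt/info.

```typescript
import { hkdfSync, randomBytes } from "node:crypto";

// ASSUMPTION: illustrative info layout and empty salt, not the real encoding.
function deriveProjectKey(masterKey: Buffer, projectId: string, orgId: string): Buffer {
  const info = Buffer.from(`project:${projectId}|org:${orgId}`, "utf8");
  const salt = Buffer.alloc(0);
  // hkdfSync returns an ArrayBuffer; wrap it for convenient comparison.
  return Buffer.from(hkdfSync("sha256", masterKey, salt, info, 32));
}

const exampleMasterKey = randomBytes(32);
const keyForProjectA = deriveProjectKey(exampleMasterKey, "proj-a", "org-1");
const keyForProjectB = deriveProjectKey(exampleMasterKey, "proj-b", "org-1");
```

Since each context yields a different key, a ciphertext moved into another project or org fails authentication under that context's key, which gives the same substitution resistance that an AAD would.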

Tests (tests/crypto/aeadAad.test.ts): round-trip under AAD; altered AAD
fails; cross-context substitution fails; org/local domain separation;
legacy no-AAD blobs still decrypt; AAD-bound blob refuses a reader that
omits the AAD. Full suite green (413), typecheck + build clean.
@cvince cvince merged commit 486b2aa into main Jun 9, 2026
1 check passed