
fix(api): tolerate undecryptable rows in /api/settings/keys#52

Merged
JFK merged 2 commits into main from fix/settings-keys-decrypt-tolerance on Apr 7, 2026

Conversation

@JFK
Owner

JFK commented Apr 7, 2026

Summary

  • Catch decryption failures per row in `GET /api/settings/keys` instead of letting a single `InvalidToken` raise out of the endpoint and return HTTP 500
  • Surface broken rows with `decryption_error: true` + a masked placeholder so the UI can prompt re-entry
  • Add a unit test that seeds an undecryptable row and asserts the endpoint stays 200 with the failed row flagged

Why

Discovered while debugging: the dev server's Settings page rendered blank because `/api/settings/keys` returned 500. Root cause was a stale `api_key.openai` row that had been encrypted with a different `ENCRYPTION_KEY` and could no longer be decrypted (likely from a previous `.env` rotation).

The previous code:
```python
return [{"masked": _mask_key(decrypt(k.value)), ...} for k in keys]
```
A single `Fernet.InvalidToken` would raise out of the comprehension → 500 → entire Settings page unusable.

This is also the most likely root cause of the misleading "Model not found. Check Settings → LLM Models." error users see when an `api_key.*` row is undecryptable: `_get_credential` raises, the keyword heuristic in `errors.py:131-137` misclassifies the exception, and the user is pointed at the wrong setting.

Test plan

  • `pytest tests/test_settings_api.py` — 16 passing, including new `test_list_keys_tolerates_undecryptable_row`
  • `pytest --ignore=tests/e2e` — 202 passing, no regressions
  • `ruff check` / `ruff format --check` clean
  • Manual verification: restart dev container, hit `/settings`, confirm the page renders even with the stale row present

Follow-up

  • The misleading error translation in `src/errors.py:131-137` should still be addressed separately. See the v1.0.0 robustness Issues to be filed (LLM model validation + raw error preservation).

🤖 Generated with Claude Code

When ENCRYPTION_KEY is rotated (or a DB is restored from another env),
existing api_key.* rows can no longer be decrypted. The previous
implementation called decrypt() inside a list comprehension, so a single
InvalidToken would raise out of the endpoint and return HTTP 500 — which
in turn blanked the entire Settings page in the UI.

Catch the exception per row, log a warning, and surface the row with
`decryption_error: true` and a masked placeholder. The UI can then prompt
re-entry instead of disappearing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment


Pull request overview

Improves robustness of the Settings API by preventing a single undecryptable encrypted setting from crashing GET /api/settings/keys, enabling the Settings UI to load even when stale/invalid encrypted rows exist (e.g., after ENCRYPTION_KEY rotation).

Changes:

  • Update GET /api/settings/keys to handle decryption failures per-row, returning a masked placeholder and decryption_error: true instead of raising.
  • Add logging for failed decrypt attempts in list_keys.
  • Add a unit test that seeds an undecryptable row and asserts the endpoint stays HTTP 200 and flags the broken entry.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/api/settings.py Wrap per-row decryption in try/except; return placeholder + decryption_error on failure.
tests/test_settings_api.py Add coverage ensuring /api/settings/keys tolerates an undecryptable encrypted row.


Comment on lines +63 to +72
```python
try:
    entry["masked"] = _mask_key(decrypt(k.value))
except Exception as e:
    # Decryption typically fails when ENCRYPTION_KEY has been rotated
    # since this row was written. Surface the row in an error state
    # so the UI can prompt re-entry, instead of crashing the whole
    # endpoint and blanking the settings page.
    logger.warning("Failed to decrypt %s: %s", k.key, e)
    entry["masked"] = "****"
    entry["decryption_error"] = True
```

Copilot AI Apr 7, 2026


The except Exception here will also swallow non-rotation failures (e.g., decrypt() raises RuntimeError when ENCRYPTION_KEY is unset/invalid in src/services/crypto.py), causing this endpoint to return 200 with placeholder data and potentially masking a fatal misconfiguration. Consider narrowing the exception to the specific decryption failures you want to tolerate (e.g., cryptography.fernet.InvalidToken, possibly decode/type errors) and letting unexpected exceptions propagate (or be handled separately) so real bugs/config issues aren’t hidden.
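A minimal sketch of the narrowed handler the review is asking for, as a self-contained illustration. The `decrypt` and `_mask_key` stubs below are stand-ins for the repo's helpers (assumed, not the actual implementation); the point is that `InvalidToken` is tolerated per row while configuration errors such as `RuntimeError` still propagate:

```python
import logging

from cryptography.fernet import Fernet, InvalidToken

logger = logging.getLogger(__name__)
fernet = Fernet(Fernet.generate_key())  # stand-in for the app's ENCRYPTION_KEY


def decrypt(token: str) -> str:
    # Stand-in for src/services/crypto.py: raises InvalidToken on a
    # wrong-key or corrupt token; a missing-key RuntimeError would not
    # be raised here and thus would not be swallowed below.
    return fernet.decrypt(token.encode()).decode()


def _mask_key(value: str) -> str:
    # Illustrative masking only; the real _mask_key may differ.
    return value[:3] + "****"


def build_entry(key: str, value: str) -> dict:
    entry = {"key": key}
    try:
        entry["masked"] = _mask_key(decrypt(value))
    except InvalidToken as e:
        # Tolerate only genuine decryption failures (e.g. key rotation);
        # anything else propagates so misconfiguration is not hidden.
        logger.warning("Failed to decrypt %s: %s", key, e)
        entry["masked"] = "****"
        entry["decryption_error"] = True
    return entry
```

With this shape, a row written under a rotated key comes back flagged, while an unset or invalid `ENCRYPTION_KEY` still fails loudly at startup or at the call site.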

Comment on lines +18 to +25
```python
# Insert a row with raw bytes that are NOT a valid Fernet token.
# Use delete-then-insert in case a previous test left a google row behind.
from sqlalchemy import delete

async with async_session() as s:
    await s.execute(delete(Setting).where(Setting.key == "api_key.google"))
    s.add(Setting(key="api_key.google", value="not-a-valid-fernet-token", encrypted=True))
    await s.commit()
```

Copilot AI Apr 7, 2026


This test inserts value="not-a-valid-fernet-token" with encrypted=True, which isn’t actually “encrypted with a different ENCRYPTION_KEY” (it’s not a valid Fernet token at all). To better match the scenario described (key rotation), consider generating a syntactically valid Fernet token using a different Fernet key and inserting that instead, so the failure mode is specifically “valid token but cannot be decrypted with the current key.”
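The suggested fixture can be sketched like this (a hedged illustration, not the repo's test code): the token is syntactically valid Fernet but signed with a key the app does not hold, so the failure mode is "wrong key", not "malformed input":

```python
from cryptography.fernet import Fernet, InvalidToken

# A token encrypted with a key the app does NOT have. It is valid Fernet,
# so decryption fails on the signature check, matching the key-rotation
# scenario rather than the malformed-input path.
foreign_key = Fernet.generate_key()
undecryptable_value = Fernet(foreign_key).encrypt(b"sk-foreign").decode()

# The app's current key cannot decrypt it:
current = Fernet(Fernet.generate_key())
try:
    current.decrypt(undecryptable_value.encode())
    raise AssertionError("should not decrypt")
except InvalidToken:
    pass  # exactly the post-rotation failure the test should exercise
```

In the test, `undecryptable_value` would take the place of the `"not-a-valid-fernet-token"` literal.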

- Narrow exception in list_keys from `Exception` to `InvalidToken` so
  config errors (e.g. RuntimeError from missing ENCRYPTION_KEY in
  src/services/crypto.py) propagate instead of being silently masked
  as a "decryption_error" row
- Replace the placeholder `not-a-valid-fernet-token` test fixture with
  a real Fernet token generated from a *different* key, so the test
  exercises the actual key-rotation scenario (valid token, signature
  fails) rather than a malformed-input path
- Re-seed `api_key.openai` with a fresh test-key token at the start of
  the tolerance test so the assertion isn't poisoned by stale rows in
  a shared dev DB

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@JFK
Owner Author

JFK commented Apr 7, 2026

Addressed both Copilot review comments in faa4dd0:

  1. src/api/settings.py:62 — narrow exception
    Replaced except Exception with except InvalidToken. RuntimeError from crypto.py (missing ENCRYPTION_KEY) now propagates correctly instead of being masked as a decryption_error: true row. Added a comment documenting the rationale.

  2. tests/test_settings_api.py:25 — realistic Fernet token
    The test now generates a syntactically valid Fernet token using a freshly generated different key (`Fernet.generate_key()`, then `Fernet(foreign_key).encrypt(b"sk-foreign")`), so the failure mode is "valid token, wrong key" — exactly the post-rotation scenario described in the PR. Also re-seeds api_key.openai with a fresh test-key token to keep the assertion stable in shared dev DBs.

pytest tests/test_settings_api.py → 16 passed. Full suite (202) clean.

JFK merged commit 8e502e3 into main Apr 7, 2026
2 checks passed
JFK deleted the fix/settings-keys-decrypt-tolerance branch April 7, 2026 13:51
JFK added a commit that referenced this pull request Apr 7, 2026
Failed jobs now persist a structured error_detail JSON blob alongside the
user-facing error_message. The keyword-based translation in
classify_error() loses information and can mistranslate (e.g. an
InvalidToken decrypt failure surfaces as "Model not found"); the new
field keeps the original exception class, raw message, stage, provider,
model, and timestamp so users can debug from the History page instead of
reading server logs.

- Add Job.error_detail Text column + Alembic migration
- src/errors.py: build_error_detail / serialize_error_detail / parse_error_detail
- transcribe.py: populate at refine/verify/metadata catch sites; resolve
  provider_name and refine/metadata model BEFORE the try block so the
  catch site does not re-query a possibly degraded session
- jobs.py: populate at top-level _process_job and _generate_meta_job
  catch sites; expose error_detail in GET /api/jobs/{id}
- templating.py: register parse_error_detail Jinja filter
- job_status.html: add <details> "Show details" affordance
- i18n: history.show_details (en/ja)
- tests: 4 new tests covering field preservation, InvalidToken
  distinction (#52 scenario), serialize/parse round-trip, and full
  pipeline catch path

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
JFK added a commit that referenced this pull request Apr 7, 2026
… (#56)

* feat(observability): preserve raw API error in error_detail field (#54)

Failed jobs now persist a structured error_detail JSON blob alongside the
user-facing error_message. The keyword-based translation in
classify_error() loses information and can mistranslate (e.g. an
InvalidToken decrypt failure surfaces as "Model not found"); the new
field keeps the original exception class, raw message, stage, provider,
model, and timestamp so users can debug from the History page instead of
reading server logs.

- Add Job.error_detail Text column + Alembic migration
- src/errors.py: build_error_detail / serialize_error_detail / parse_error_detail
- transcribe.py: populate at refine/verify/metadata catch sites; resolve
  provider_name and refine/metadata model BEFORE the try block so the
  catch site does not re-query a possibly degraded session
- jobs.py: populate at top-level _process_job and _generate_meta_job
  catch sites; expose error_detail in GET /api/jobs/{id}
- templating.py: register parse_error_detail Jinja filter
- job_status.html: add <details> "Show details" affordance
- i18n: history.show_details (en/ja)
- tests: 4 new tests covering field preservation, InvalidToken
  distinction (#52 scenario), serialize/parse round-trip, and full
  pipeline catch path

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(observability): address Copilot review on PR #56

- _process_job: use get_provider_name(job.provider) for error_detail.provider
  so internal "whisper" doesn't leak through where other catch sites record
  the user-facing "openai".
- _generate_meta_job: pre-compute provider_name and resolve effective model
  before the try block, then record exactly that pair in error_detail.
- transcribe.py metadata step: pass the resolved metadata_model to
  _run_metadata_generation so the call, cost log, and error_detail all
  reference the same model (was re-querying _get_model internally).
- errors.build_error_detail: emit occurred_at as second-precision ISO-8601
  with Z suffix (e.g. 2026-04-07T15:32:18Z) instead of microsecond +00:00
  for predictable UI rendering.
- tests: assert the new occurred_at format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>