Fix Codex session provider repair after provider switch #704
Pull request overview
This PR adds an automatic “session visibility” repair path for Codex profiles when the effective provider changes (OAuth / API Key / Local API Service), ensuring rollout files and state_5.sqlite thread metadata don’t remain pinned to a stale provider after switching.
Changes:
- Add a directory-scoped repair API (repair_session_visibility_for_dir) plus unit tests covering rollout + SQLite rewrites and no-op behavior.
- Trigger automatic repairs after account switches and after enabling local access for the default Codex home.
- Trigger automatic repairs before instance launch when bound-account injection changes a profile’s provider, and fix a frontend timer type so TS build/typecheck passes.
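
The no-op behavior listed above can be illustrated with a small, std-only sketch. This is not the PR's implementation: `RolloutMeta`, `RepairPlan`, and `plan_repair` are hypothetical names, the provider strings are placeholders, and the assumption that each rollout file and each thread row records a single provider string is made only for illustration.

```rust
use std::path::PathBuf;

/// Hypothetical view of one rollout file: its path and the provider it records.
#[derive(Debug, Clone, PartialEq)]
struct RolloutMeta {
    path: PathBuf,
    provider: String,
}

/// Hypothetical result of scanning a profile directory before any write happens.
#[derive(Debug, PartialEq)]
struct RepairPlan {
    rollouts_to_rewrite: Vec<PathBuf>,
    sqlite_rows_to_update: usize,
}

impl RepairPlan {
    /// True when nothing is stale, so neither a backup nor a write is needed.
    fn is_noop(&self) -> bool {
        self.rollouts_to_rewrite.is_empty() && self.sqlite_rows_to_update == 0
    }
}

/// Collect every rollout file and count every thread row whose recorded
/// provider differs from the target provider.
fn plan_repair(rollouts: &[RolloutMeta], thread_providers: &[&str], target: &str) -> RepairPlan {
    RepairPlan {
        rollouts_to_rewrite: rollouts
            .iter()
            .filter(|r| r.provider != target)
            .map(|r| r.path.clone())
            .collect(),
        sqlite_rows_to_update: thread_providers.iter().filter(|p| **p != target).count(),
    }
}
```

Separating the scan from the write keeps the no-op case cheap: a caller can check `is_noop()` and return early before creating any backup directory.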
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/stores/usePlatformLayoutStore.ts | Adjust timer handle typing for browser window.setTimeout usage to satisfy frontend typechecking. |
| src-tauri/src/modules/codex_session_visibility.rs | Introduce single-directory repair helper + add tests for rollout/SQLite provider repair behavior. |
| src-tauri/src/commands/codex.rs | Invoke automatic session visibility repair after default-home provider changes (account switch / local access activate). |
| src-tauri/src/commands/codex_instance.rs | Repair profile session visibility pre-launch when bound account injection alters provider. |
| CHANGELOG.zh-CN.md | Document the Codex provider-switch repair behavior. |
| CHANGELOG.md | Document the Codex provider-switch repair behavior (English). |

    let backup_dir = backup_instance_files(
        data_dir,
        &rollout_changes,
        sqlite_rows_to_update > 0,
        instance_id,
        &target_provider,
    )?;
    let backup_dir_string = backup_dir.to_string_lossy().to_string();


    return Ok(CodexSessionVisibilityRepairItem {
        instance_id: instance_id.to_string(),
        instance_name: instance_name.to_string(),
        target_provider,
        changed_rollout_file_count: 0,
        updated_sqlite_row_count: 0,
        skipped_sqlite_file: sqlite_scan.skipped_unusable_database,
        backup_dir: None,
        running: false,
    });


    match modules::codex_session_visibility::repair_session_visibility_for_dir(
        profile_dir,
        "__launch__",
        "启动实例", // "Launch instance"
    ) {

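
The pre-launch trigger described in the overview (repair only when bound-account injection changes the profile's provider) can be sketched as a small guard. This is a hedged illustration, not code from the PR: `LaunchRepair` and `repair_if_provider_changed` are hypothetical names, the repair step is passed in as a closure rather than calling the real module, and the provider strings in any usage are placeholders.

```rust
/// Hypothetical outcome of the pre-launch check.
#[derive(Debug, PartialEq)]
enum LaunchRepair {
    /// The provider did not change, so no repair was attempted.
    Skipped,
    /// The provider changed and the repair closure completed successfully.
    Ran { target_provider: String },
}

/// Run `repair` only when injection changed the effective provider.
/// `before` is `None` when the profile had no provider recorded yet.
fn repair_if_provider_changed<F>(
    before: Option<&str>,
    after: &str,
    mut repair: F,
) -> Result<LaunchRepair, String>
where
    F: FnMut(&str) -> Result<(), String>,
{
    if before == Some(after) {
        return Ok(LaunchRepair::Skipped);
    }
    // The error is returned to the caller, which decides whether a failed
    // repair should block the launch or only be logged.
    repair(after)?;
    Ok(LaunchRepair::Ran {
        target_provider: after.to_string(),
    })
}
```

A caller could `match` on the result, as the snippet above does, and log a failure instead of aborting the launch; which policy the PR applies is not shown in the excerpt.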
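Thanks for fixing this. The behavior I see locally looks like the same class of problem this PR addresses, but the current fix may still miss one key scenario.

Below is a redacted timeline captured with a local read-only monitoring script. The action was clicking the launch button for a subscription account / Local API Service on the Codex page in Cockpit:

The dangerous state is the final segment: Codex has already picked up the Local API Service's

I observed two problems locally:

Suggested fix:

The logic of my local temporary repair script is roughly:

After a full repair, the local audit results converged to: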
35f26c7 to c5ea9f7
Updated the PR branch with a narrower follow-up for the stale rollout metadata case. What changed:
Validation I could run locally:
Not run locally:
Extra local observation while reproducing: after the on-disk repair converged, an old orphaned Codex app-server process that had started days earlier was still holding an old rollout file descriptor and could continue sending requests with a cached stale provider. The PR fix reduces the startup/switch window, but already-running old app-server processes may still need to be restarted once after applying the fix.
Summary
Verification
Not run