fix(codex): surface remote auth and app-server errors #483

Open
Shujakuinkuraudo wants to merge 1 commit into tiann:main from Shujakuinkuraudo:fix/codex-error-passthrough

@Shujakuinkuraudo
Contributor

Summary

  • preserve upstream Codex app-server stderr and JSON-RPC error metadata
  • allow anonymous task_failed events with real error details to pass terminal turn guards
  • forward detailed remote Codex failures into session messages so web/remote UI can show auth and upstream errors

Problem

Real Codex failures such as 401 auth errors or upstream high-demand errors were often visible only in Codex stderr, while HAPI remote/web UI showed only generic failures or nothing useful.

A concrete failure mode was:

  • app-server emitted a task_failed-like failure without turn_id
  • remote launcher treated it as stale because a turn was active
  • detailed error text was dropped before reaching session messages
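
The failure mode above can be sketched as a small guard. This is a hypothetical illustration, not the code in cli/src/codex/utils/terminalEventGuard.ts: the `TerminalEvent` shape, the `shouldAcceptTerminalEvent` name, and the field names `turnId` and `error` are all assumptions made for the example.

```typescript
// Hypothetical event shape; the real HAPI types may differ.
interface TerminalEvent {
    type: 'task_complete' | 'task_failed';
    turnId?: string; // missing on "anonymous" events
    error?: string;  // upstream error detail, if any
}

// Decide whether a terminal event may end the currently active turn.
function shouldAcceptTerminalEvent(event: TerminalEvent, activeTurnId: string): boolean {
    // An event that names a turn is accepted only when it matches the active one.
    if (event.turnId !== undefined) {
        return event.turnId === activeTurnId;
    }
    // An anonymous event would otherwise be treated as stale. Let it through
    // only when it is a failure that carries real error detail.
    const detail = (event.error ?? '').trim();
    return event.type === 'task_failed' && detail.length > 0;
}
```

Under these assumptions, an anonymous `task_failed` event carrying `unexpected status 401 Unauthorized` is accepted, while an anonymous `task_failed` with no detail, or an event naming a different turn, is still rejected as stale.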

Changes

  • add richer CodexAppServerError metadata: stderrTail, stage, exit code, signal, error code, retryability
  • capture stderr tails from app-server / spawn failures
  • preserve stderr, exit code, retryability in app-server event conversion
  • allow anonymous failed terminal events with details through the terminal event guard
  • emit detailed task_failed text into session messages for web/mobile visibility
  • use piped stderr in local Codex mode so upstream stderr can be surfaced consistently
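
As a rough illustration of the metadata and stderr-tail items in this list, the sketch below shows one way such an error type and a bounded tail buffer could look. The names `CodexAppServerErrorSketch`, `AppServerErrorDetails`, and `StderrTail`, along with every field type, are assumptions for the example and are not taken from the PR's diff.

```typescript
// Hypothetical metadata shape; field names follow the list above,
// but the exact names and types in the PR may differ.
interface AppServerErrorDetails {
    stage?: string;          // e.g. which phase failed (assumed free-form)
    stderrTail?: string;     // last part of the child's stderr
    exitCode?: number | null;
    signal?: string | null;
    code?: number;           // JSON-RPC error code
    retryable?: boolean;
}

class CodexAppServerErrorSketch extends Error {
    readonly details: AppServerErrorDetails;

    constructor(message: string, details: AppServerErrorDetails = {}) {
        super(message);
        this.name = 'CodexAppServerErrorSketch';
        this.details = details;
    }
}

// Keeps only the last `maxChars` characters of everything appended,
// so a noisy child process cannot grow the buffer without bound.
class StderrTail {
    private buffer = '';
    private readonly maxChars: number;

    constructor(maxChars: number) {
        this.maxChars = maxChars;
    }

    append(chunk: string): void {
        // slice(-0) would return the whole string, so handle 0 explicitly.
        this.buffer = this.maxChars > 0 ? (this.buffer + chunk).slice(-this.maxChars) : '';
    }

    toString(): string {
        return this.buffer;
    }
}
```

A caller could append each stderr chunk to a `StderrTail` and, when the process exits abnormally, pass `tail.toString()` as `stderrTail` when constructing the error.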

Validation

  • bun test cli/src/codex/utils/terminalEventGuard.test.ts
  • bun typecheck
  • bun run build:single-exe
  • deployed built binary to a live server and verified via https://hapi.server4.shujk.top
  • confirmed a real remote Codex auth failure now appears in session messages, e.g. Task failed: unexpected status 401 Unauthorized ...

@Shujakuinkuraudo
Contributor Author

Shujakuinkuraudo commented Apr 16, 2026

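Validation environment:

  • Codex CLI
  • full remote spawn → hub → web/session messages path

Validation steps:

  1. Temporarily wrote a bad key to trigger a real 401
  2. Called the API through the production domain:
    • /api/auth
    • /api/machines/:id/spawn
    • /api/sessions/:id/messages
  3. Started a remote Codex session and sent one message
  4. Polled session messages and confirmed the 401 was passed through to the UI
  5. Restored the original auth.json afterwards

Example of a session message actually received (the upstream text 未提供令牌 means "no token provided"):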

{
  "type": "message",
  "message": "Task failed: unexpected status 401 Unauthorized: 未提供令牌 (request id: 20260416201413848500688268d9d6Tx1mkANZ), url: https://**************/v1/responses"
}

Notes:

  • The 401 details now reach HAPI session messages
  • The web / mobile / remote UI can now display the real upstream error
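
For illustration, the mapping from a failed task to a session message of the shape shown above might look like the sketch below. Only the output shape (`type: "message"` with a `Task failed: ...` text) comes from the example; the `TaskFailedEvent` type, its `error` field, and the `toSessionMessage` function are hypothetical.

```typescript
// Hypothetical input event; the real event type in HAPI may differ.
interface TaskFailedEvent {
    type: 'task_failed';
    error?: string; // upstream error detail, if any
}

// Output shape mirrors the session message example above.
interface SessionMessage {
    type: 'message';
    message: string;
}

function toSessionMessage(event: TaskFailedEvent): SessionMessage {
    const detail = (event.error ?? '').trim();
    return {
        type: 'message',
        // Fall back to a generic text only when no detail is available.
        message: detail.length > 0 ? `Task failed: ${detail}` : 'Task failed'
    };
}
```

With an `error` of `unexpected status 401 Unauthorized`, this yields the message text `Task failed: unexpected status 401 Unauthorized`.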

Local checks completed:

  • bun test cli/src/codex/utils/terminalEventGuard.test.ts
  • bun typecheck
  • bun run build:single-exe

@github-actions github-actions bot left a comment

Findings

  • [Major] Local Codex stderr is now hidden from the interactive terminal. Switching stdio[2] to pipe at cli/src/codex/codexLocal.ts:88 routes stderr into spawnWithAbort, but that helper only buffers child.stderr and appends it to the eventual exit error instead of forwarding it live (cli/src/utils/spawnWithAbort.ts:66, cli/src/utils/spawnWithAbort.ts:157). That means local users no longer see auth failures and other stderr diagnostics as they happen.
    Suggested fix:
    await spawnWithTerminalGuard({
        ...,
        stdio: ['inherit', 'inherit', 'inherit']
    });

Summary
Review mode: initial. 1 major regression found in local Codex stderr handling.

Testing
Not run (automation): bun is not installed in this runner, so the targeted CLI tests could not be executed.

HAPI Bot

  logExit: true,
- shell: process.platform === 'win32'
+ shell: process.platform === 'win32',
+ stdio: ['inherit', 'inherit', 'pipe']

[MAJOR] Piping stderr here hides it from local interactive users. In this path spawnWithAbort only buffers child.stderr for the final rejection text and never writes it back to process.stderr, so auth failures and other diagnostics disappear until the process exits.

Suggested fix:

await spawnWithTerminalGuard({
    ...,
    stdio: ['inherit', 'inherit', 'inherit']
});
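
If the stderr tail is still wanted for error reporting, an alternative to inheriting stderr is to keep the pipe and tee it: forward each chunk live and buffer a bounded tail at the same time. The sketch below is not the reviewer's suggestion and not code from this PR; `teeStderr`, `ChunkSource`, and `ChunkSink` are hypothetical names, and it assumes the stream emits strings (for a Node stream, after calling `setEncoding('utf8')`).

```typescript
// Minimal structural types so the sketch does not depend on Node typings.
interface ChunkSource {
    on(event: 'data', listener: (chunk: string) => void): unknown;
}

interface ChunkSink {
    write(chunk: string): unknown;
}

// Forwards every chunk to `sink` as it arrives and keeps the last
// `maxTailChars` characters for later error reporting.
// Returns a getter for the current tail.
function teeStderr(source: ChunkSource, sink: ChunkSink, maxTailChars: number): () => string {
    let tail = '';
    source.on('data', (chunk) => {
        sink.write(chunk); // live forwarding, so interactive users still see diagnostics
        tail = maxTailChars > 0 ? (tail + chunk).slice(-maxTailChars) : '';
    });
    return () => tail;
}
```

In a Node process this could be wired up roughly as `child.stderr.setEncoding('utf8')` followed by `teeStderr(child.stderr, process.stderr, 4096)`, where the 4096 limit is an arbitrary choice. Whether this fits `spawnWithAbort` depends on details of that helper not shown here.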
