Route Codex Responses through local converter#62
Conversation
PR Review: Route Codex Responses through local converter1.
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ee7fd12aa0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ): Record<string, string> { | ||
| const headers: Record<string, string> = { | ||
| "OpenAI-Beta": "responses=experimental", | ||
| "chatgpt-account-id": extractCodexAccountId(token), |
There was a problem hiding this comment.
Allow Codex account IDs stored outside the token
When codex-local returns a credential whose accessToken is opaque or simply lacks the chatgpt_account_id JWT claim, this header construction throws before any request is sent even though the Codex credential path already preserves accountId separately from auth.json/refreshOpenAICodexToken. Pass that stored account ID through the resolver/stream options, or fall back to it here, instead of requiring every bearer token to be self-decodable.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 1069300.
CodexProviderApiKeyResolver now returns { apiKey, accountId } for codex-local, PiBridge forwards codexAccountId, and the Codex adapter uses that value before falling back to JWT claim extraction. Added regression coverage for an opaque bearer token with resolver-provided account id.
pnpm preflight passes locally.
| : undefined, | ||
| } | ||
| : event.response; | ||
| yield { ...event, type: "response.completed", response } as unknown as ResponseStreamEvent; |
There was a problem hiding this comment.
Preserve failed response details when normalizing Codex done events
If the Codex stream ends with response.done whose response.status is failed or cancelled, this remapping turns it into response.completed, so processResponsesStream skips its response.failed branch and the outer stream later throws the generic An unknown error occurred instead of the upstream error/incomplete details. Map failed/cancelled final statuses to the failure path, or throw with the embedded details here, so production failures remain diagnosable and retry classification can use the real error.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 1069300.
Codex terminal events now normalize response.done with failed or cancelled status to response.failed, so the shared Responses processor preserves upstream error and incomplete_details instead of collapsing to a generic unknown error. Added regression coverage for the failed response.done path.
pnpm preflight passes locally.
| tool_choice: "auto", | ||
| parallel_tool_calls: true, |
There was a problem hiding this comment.
Forward the Codex output token cap
For openai-codex-responses models, the new local request shape never includes max_output_tokens, so any maxOutputTokens configured on the model is ignored on exactly the Codex path this commit reroutes; a low cap intended to control cost or latency will no longer be enforced while the non-Codex Responses branch still forwards model.maxTokens. Include the configured token budget in the Codex params, or map it to the Codex backend's equivalent field if it differs.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 1069300.
The Codex adapter now forwards max_output_tokens: model.maxTokens in buildOpenAICodexResponsesParams, and the request-shape regression test asserts the configured model budget is sent on the Codex path.
pnpm preflight passes locally.
|
Please review the current head The intended design is now:
Please focus the review on whether this boundary is right for Pokoclaw long-term:
The PR description has been rewritten with the root cause, design rationale, and validation details. Local |
Architecture Review of
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1069300c18
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| : type === "response.incomplete" | ||
| ? "incomplete" | ||
| : undefined; | ||
| const status = normalizeCodexResponseStatus(response.status, type) ?? fallbackStatus; |
There was a problem hiding this comment.
Require status on Codex done events
When a Codex terminal response.done event omits response.status due to upstream shape drift, this expression leaves status undefined and the adapter still emits response.completed; the shared processor then maps the missing status to a successful stop, so a malformed terminal event can silently complete instead of failing at the Codex boundary. Since response.done is the critical event that decides success vs failure, require a known status for that event rather than falling through with undefined.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 12e6010.
response.done now requires a known Codex response.status; a missing status fails at the adapter boundary instead of being normalized to a successful completion. Added a regression test in providers.codex-stream.test.ts for the missing-status terminal event.
pnpm preflight passes locally.
| yield normalizeCodexTerminalEvent(event, type); | ||
| continue; | ||
| case "response.output_item.added": | ||
| case "response.output_item.done": |
There was a problem hiding this comment.
Validate finalized Codex output items
Because response.output_item.done reuses the same loose validator as added, a done event for a message or reasoning item can omit the finalized content/summary and still reach the shared processor; that processor overwrites the accumulated streamed block from the done item (item.content ?? [] / item.summary ?? []), so a Codex shape change that sends deltas but not final content would silently erase the generated text or reasoning. Keep the permissive shape for added, but require the finalized fields on output_item.done before passing it on.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 12e6010.
The Codex adapter now keeps added permissive but treats response.output_item.done as finalized: message items must include content, reasoning items must include summary, and those arrays are validated before they reach the shared Responses processor. This prevents streamed content from being overwritten by an incomplete final item. Added focused regression tests for both message and reasoning done events.
pnpm preflight passes locally.
| const requestServiceTier = getRequestServiceTier(options); | ||
| if (requestServiceTier) { | ||
| params.service_tier = requestServiceTier; |
There was a problem hiding this comment.
Preserve requested tier for Responses pricing
When a normal openai-responses model is configured with serviceTier = "fast" or "flex", this now sends the requested service_tier, but the streaming path still only passes completedServiceTier into processResponsesStream; if the response reports default/null or omits service_tier, the shared processor's existing request-tier fallback receives undefined, so usage cost is recorded at base price even though the request used priority/flex. Keep the requested tier and merge it the same way the Codex adapter does before applying service-tier pricing.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 12e6010.
The generic openai-responses stream path now keeps the request service tier separately from the response tier and resolves default/missing response tiers back to the requested tier before applying estimated pricing. Added a regression test where the request uses priority but the upstream completion reports service_tier: "default".
pnpm preflight passes locally.
|
Please review the latest head I handled the new inline and overall comments with the same TDD flow: added failing regression coverage first, fixed the provider/bridge code, then ran full local What changed since
I also updated the PR description to include the latest root-cause/design notes, service-tier ownership, and the new regression coverage. Local verification on
For this pass, please re-check both the specific fixes and the architecture boundary: Codex protocol behavior should stay in |
|
Codex Review: Didn't find any major issues. 🚀 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@claude review please |
ReviewOverall: Clean, well-tested PR. No blocking issues. The architecture boundary is correctly drawn — Codex transport is now isolated in 1.
|
|
@claude Addressed your follow-up notes in
No runtime behavior changed. Local Please review again. |
Problem
This PR started from a production failure in the
openai-codex-responsespath:Duplicate item found with id msg_1.@mariozechner/pi-aiCodex converter, so cross-provider assistant history could still emit duplicate fallback response item ids.The first fix routed Codex Responses through the local converter, which addressed the production duplicate-id failure. Review then exposed a larger design issue: the implementation had mixed Codex-specific backend behavior into the generic OpenAI upstream adapter. That made the code harder to reason about and left several edge cases fragile.
Root Cause
There are two different layers that looked similar but have different contracts:
src/agent/llm/upstream-openai.ts.response.done.The buggy version treated Codex as a small branch inside the generic OpenAI Responses path. That was the wrong boundary. It forced
upstream-openai.tsto know aboutchatgpt-account-id, Codex account extraction, Codex reasoning effort clamping, Codex service-tier pricing, and Codex stream normalization. It also meant review fixes could easily become scattered conditionals rather than a coherent provider adapter.Approach
This PR now routes
openai-codex-responsesthrough a dedicated Codex adapter insrc/agent/llm/providers/codex/stream.ts.The adapter owns the Codex-specific behavior end to end:
convertResponsesMessagesconvertermax_output_tokensfrom the configured model budgettools,tool_choice, andparallel_tool_callswhen visible tools are actually presentinclude,text,instructions, reasoning summary, and reasoning effort behavior in one placechatgpt-account-idresponse.done/response.incomplete/response.completedevents for the shared Responses stream processorresponse.doneso malformed terminal events cannot silently become successresponse.failedso upstream error details are preservedprocessResponsesStreamcontentand reasoningsummaryonresponse.output_item.donebecause the shared processor replaces the streamed block with the finalized itemitem.ideven for unknown Codex output item typessrc/agent/llm/upstream-openai.tsis now back to the generic OpenAI-compatible boundary. It dispatches Codex to the Codex adapter and no longer contains Codex-specific header, event, or request-shape logic.The generic OpenAI Responses path also now keeps the requested service tier available through the stream processor and final usage normalization. If the upstream response reports
default, omitsservice_tier, or otherwise lacks the effective requested tier, estimated pricing still uses the requested priority/flex tier that was actually sent.Credential Design
codex-localcredentials can store the account id separately from the access token. The previous implementation assumed every bearer token could be decoded as a JWT with achatgpt_account_idclaim. That is not a stable contract.The resolver now returns a structured credential for Codex-local auth:
apiKey: the access tokenaccountId: the stored Codex account id when availablePiBridgeforwards that account id through stream options, and the Codex adapter uses it forchatgpt-account-id. JWT extraction remains only as a fallback for configured API keys that still carry the claim.Streaming And Non-Streaming Design
The production failure surfaced in streaming task execution, but leaving
completeTurn/completeTexton the bundled pi-ai Codex path would preserve a second, divergent Codex converter. That would be a latent bug.For
openai-codex-responses, non-streaming bridge calls now drain the same local streaming adapter and consume the finalAssistantMessage. That keeps Codex conversion, request construction, and event processing on one path across task runs, normal chat turns, and compaction-style calls. Error-path coverage now includes the non-streaming bridge path as well, so a failed Codex terminal event must reject instead of resolving as a generic message.Service-Tier Ownership
Service-tier handling now has explicit ownership:
options.serviceTierand setservice_tierdirectly in the request params.onPayloadhook remains only as compatibility forstreamSimpleadapters that do not know about Pokoclaw'soptions.serviceTierextension. Its comment now documents that purpose.This keeps the runtime contract visible without removing compatibility for adapters still delegated to pi-ai.
Why This Shape
The main architectural choice is to treat Codex as a provider-specific adapter, not as a feature flag inside generic OpenAI Responses.
That boundary is important because:
ResponseStreamEventThis is not only fixing the duplicate id symptom. It removes the design condition that let this class of bug survive in one path while being fixed in another.
Review Focus
Please review this PR at the architecture boundary level, not only at individual comment details:
providers/codex/stream.tsis the right ownership boundary for Codex transport behaviorupstream-openai.tsis now appropriately generic againCodexProviderApiKeyResolverthroughPiBridgeinto the adapter is explicit enoughstreamSimplecompatibilityTests
Local verification:
pnpm preflightRegression coverage added or updated:
response.donepreserving upstream error detailsresponse.donerequiring a known statusmax_output_tokensdefaultProduction Validation
Before this broader cleanup, the branch was temporarily deployed to the SSH host and the user confirmed the runtime test passed for the original scheduled-task failure. This PR now keeps that production fix while tightening the architecture and regression coverage around it.