Skip to content

fix: restore chat completion multi-choice support#672

Open
nabinchha wants to merge 3 commits into
mainfrom
nmulepati/fix-620-chat-completion-choices-n
Open

fix: restore chat completion multi-choice support#672
nabinchha wants to merge 3 commits into
mainfrom
nmulepati/fix-620-chat-completion-choices-n

Conversation

@nabinchha
Copy link
Copy Markdown
Contributor

@nabinchha nabinchha commented May 18, 2026

📋 Summary

Restores chat completion multi-choice compatibility by reintroducing the n request field and preserving all returned choices in the canonical response. Existing single-choice callers can continue using response.message, while callers that request multiple completions can use response.choices.

🔗 Related Issue

Fixes #620

🔄 Changes

  • Add ChatCompletionRequest.n and export a new ChatCompletionChoice response type.
  • Preserve every returned chat completion choice in ChatCompletionResponse.choices, while keeping response.message as the first choice for existing callers.
  • Parse all OpenAI-compatible chat completion choices instead of discarding everything after index 0.
  • Forward n through ModelFacade.completion() and OpenAI-compatible transport bodies, while excluding it from Anthropic request forwarding.
  • Strip n from generate() and agenerate() before delegating to completion because those APIs return one parsed result.
  • Add tests for request forwarding, multi-choice parsing, async parsing, compatibility access, Anthropic n exclusion, and generate(..., n=...) stripping.

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

  • types.py — updates the canonical chat completion response contract while preserving response.message.
  • facade.pycompletion(..., n=...) exposes multiple choices, while generate(..., n=...) strips n because it returns a single parsed result.

🧪 Testing

  • make test passes (not run; focused model suite was run instead)
  • PYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev pytest packages/data-designer-engine/tests/engine/models -q passes (532 passed)
  • uv run --group dev ruff check ... passes for touched files
  • Unit tests added/updated
  • E2E tests added/updated (N/A — no E2E surface)

✅ Checklist

  • Follows commit message conventions
  • Commits are signed off (DCO)
  • Architecture docs updated (N/A — no architecture docs needed)

@nabinchha nabinchha requested a review from a team as a code owner May 18, 2026 16:27
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 18, 2026

Greptile Summary

This PR restores multi-choice support for chat completions by reintroducing the n request field and wiring it through the entire stack — from ChatCompletionRequest through transport, response parsing, and the public ChatCompletionResponse shape. Existing single-choice callers are unaffected because response.message continues to hold the first choice, while new callers can iterate response.choices.

  • types.py: Adds ChatCompletionChoice, extends ChatCompletionResponse with a choices list and a messages property, and uses __post_init__ to guarantee backward compatibility when choices is omitted.
  • parsing.py: Refactors both the sync and async parsers to iterate over all returned choices via new helpers (normalize_choice_list, parse_chat_completion_choice, etc.) instead of extracting only index 0.
  • facade.py / adapters/anthropic.py: Forwards n through completion() and OpenAI-compatible transports while explicitly stripping it from generate()/agenerate() (single-result APIs) and from Anthropic requests (unsupported field).

Confidence Score: 5/5

Safe to merge — backward compatibility is maintained through response.message and the __post_init__ guard, all three transport paths (OpenAI-compatible, Anthropic, facade) are correctly wired, and the new behavior is well-covered by tests.

The change is well-scoped: each modified function has a clear before/after, the n-stripping in generate()/agenerate() is correct and tested, and Anthropic exclusion is explicitly verified. No regressions were found in the parsing, transport, or facade layers.

No files require special attention.

Important Files Changed

Filename Overview
packages/data-designer-engine/src/data_designer/engine/models/clients/types.py Adds ChatCompletionChoice dataclass and extends ChatCompletionResponse with a choices list and a messages property; backward compat preserved via __post_init__ guard.
packages/data-designer-engine/src/data_designer/engine/models/clients/parsing.py Refactors response parsing to iterate over all choices instead of only the first; adds normalize_choice_list, parse_chat_completion_choice, and async counterparts.
packages/data-designer-engine/src/data_designer/engine/models/facade.py Adds n to _COMPLETION_REQUEST_FIELDS so it forwards through completion(); strips n inside generate()/agenerate() loops because those methods return a single parsed result.
packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/anthropic.py Adds n to the Anthropic adapter's excluded-fields list, correctly preventing it from being forwarded to the Anthropic API.
packages/data-designer-engine/src/data_designer/engine/models/clients/init.py Exports the new ChatCompletionChoice type alongside the existing public surface.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ModelFacade
    participant ModelClient
    participant Parser

    Caller->>ModelFacade: "completion(messages, n=4)"
    ModelFacade->>ModelFacade: _build_chat_completion_request (n included)
    ModelFacade->>ModelClient: "completion(ChatCompletionRequest[n=4])"
    ModelClient->>Parser: parse_chat_completion_response(raw)
    Parser->>Parser: normalize_choice_list(raw["choices"])
    loop for each choice [0..n-1]
        Parser->>Parser: parse_chat_completion_choice(choice)
    end
    Parser-->>ModelClient: "ChatCompletionResponse(message=choices[0].message, choices=[...])"
    ModelClient-->>ModelFacade: ChatCompletionResponse
    ModelFacade-->>Caller: ChatCompletionResponse

    Caller->>ModelFacade: "generate(prompt, n=4)"
    ModelFacade->>ModelFacade: completion_kwargs.pop("n", None)
    ModelFacade->>ModelFacade: completion(messages) [no n]
    ModelFacade-->>Caller: (parsed_result, messages)

    Note over ModelFacade: Anthropic adapter also strips n from the outgoing HTTP body
Loading

Reviews (4): Last reviewed commit: "Merge branch 'main' into nmulepati/fix-6..." | Re-trigger Greptile

@github-actions
Copy link
Copy Markdown
Contributor

Review: PR #672 — fix: restore chat completion multi-choice support

Summary

The PR restores chat-completion multi-choice support (issue #620) by:

  • Adding n to ChatCompletionRequest and forwarding it through ModelFacade.completion() and OpenAI-compatible transport bodies; explicitly excluded from Anthropic forwarding.
  • Adding a new ChatCompletionChoice dataclass and a choices: list[ChatCompletionChoice] field on ChatCompletionResponse. response.message is preserved as the first choice for backward compatibility via __post_init__.
  • Refactoring parse_chat_completion_response / aparse_chat_completion_response to iterate every returned choice (instead of discarding everything after index 0), with shared helpers parse_chat_completion_choice, parse_assistant_message, parse_choice_index, parse_choice_finish_reason, and normalize_choice_list.
  • Adding tests for request n forwarding, multi-choice parsing, single-choice compatibility, and generate(..., n=...) forwarding.

The change is well-scoped and backward-compatible. Existing single-choice callers continue to work via response.message; new callers can iterate response.choices.

Findings

Correctness

  • generate(..., n=...) is a footgun. The n keyword is now on the _COMPLETION_REQUEST_FIELDS allowlist (facade.py:83), so callers can pass it to generate() — and the test test_generate_forwards_n_to_completion exercises that. But generate() only ever consumes completion_response.message.content (facade.py:372, facade.py:475); the remaining n-1 choices are silently dropped. Callers pay for output tokens for no benefit. Options worth considering:
    • Reject n>1 in generate() with a clear error, or
    • Document explicitly in the docstring that n is forwarded but only the first choice is parsed (the PR description notes this, but the public API doesn't), or
    • Pop n out of kwargs in generate() before delegating, since generate semantics don't currently accommodate it.
  • Anthropic silently drops n. n is added to AnthropicClient._TRANSPORT_EXCLUDE (correct — Anthropic Messages API doesn't accept it), so a caller who passes n=4 to an Anthropic-backed model gets a single response with no warning. Consider logging at debug or warning level when n>1 is supplied to a provider that doesn't honor it; otherwise this is an easy-to-miss surprise for users switching providers.
  • generated_images semantics changed for multi-choice. Previously extract_usage received len(images) from the first choice only; now it receives sum(len(c.message.images) for c in choices) (parsing.py:41, parsing.py:55). Identical for n=1 (the only currently exercised path) but worth confirming this aggregation is what billing/usage downstream expects when multi-choice image generation is added.
  • No validation of n. A caller can pass n=0 or a negative integer; the dataclass accepts it and forwards. Fine if the upstream provider is the source of truth, but a small __post_init__ check on ChatCompletionRequest would fail fast.

API design

  • ChatCompletionResponse.message and choices[0].message can drift. If both are passed at construction (as some external callers might), __post_init__ only backfills choices when empty — it does not assert consistency. The internal parsers always keep them in sync, but a stricter invariant (assert/normalize) would prevent subtle bugs later. Lightweight: in __post_init__, after backfill, assert self.message is self.choices[0].message.
  • messages property name collides conceptually with the request messages field. ChatCompletionResponse.messages returns a list[AssistantMessage], while ChatCompletionRequest.messages is the conversation history. assistant_messages or just steering callers toward choices would reduce ambiguity. Minor.

Tests

  • Coverage of the sync path is solid: request forwarding (test_chat_completion_request_n_is_forwarded_into_body), multi-choice parsing (test_parse_chat_completion_response_preserves_all_choices), backward-compat single-message (test_chat_completion_response_exposes_choices_for_single_message), and facade forwarding (test_completion_forwards_n_to_request, test_generate_forwards_n_to_completion).
  • No async multi-choice parsing test. aparse_chat_completion_response and aparse_chat_completion_choice mirror the sync logic but are uncovered. A 1-line async equivalent of test_parse_chat_completion_response_preserves_all_choices would close that gap.
  • No test asserting n is excluded from Anthropic payloads. Adding n to _TRANSPORT_EXCLUDE is a correctness-critical change (Anthropic would 400 on it); a test that builds an AnthropicClient payload from a request with n=4 and asserts "n" not in payload would lock that in.
  • No test for normalize_choice_list edge cases. Specifically, the "single non-list choice" branch (return [raw_choices]) is untested. Worth a unit test if any provider actually returns that shape — otherwise consider deleting the branch and just handling None + list.

Style / conventions

  • Follows project conventions: from __future__ import annotations, modern type syntax (int | None), absolute imports, no comments on obvious code. Good.
  • parse_chat_completion_choice and aparse_chat_completion_choice differ only in sync vs async image extraction — acceptable duplication given the project's existing sync/async split pattern (extract_images_from_chat_message vs aextract_images_from_chat_message).

Performance / security

  • No performance impact — multi-choice parsing iterates choices once. No new I/O.
  • No security implications.

Structural Impact

No pre-computed structural impact analysis was available (/tmp/structural-impact-672.md not present). Manual assessment: the change is contained to data_designer.engine.models.clients and data_designer.engine.models.facade. Import direction (interface → engine → config) is preserved; no new cross-package edges. The ChatCompletionResponse shape change is additive (new field with default factory + backfilling __post_init__), so external consumers reading response.message are unaffected. Risk: low.

Verdict

Approve with minor follow-ups. The core change is clean, backward-compatible, and well-tested for the sync OpenAI-compatible path. The two items I'd most like to see addressed before merge:

  1. Decide on the generate(..., n=...) story — either reject, document, or strip n in generate(). Tests currently encode the silent-drop behavior, which makes it harder to change later.
  2. Add a test asserting n is excluded from the Anthropic payload, since that's the only thing preventing a 400 when an Anthropic-backed model is used.

Nice-to-have: async multi-choice parsing test, soft validation of n>=1, and a clarifying note on ChatCompletionResponse.messages vs choices.

@nabinchha
Copy link
Copy Markdown
Contributor Author

Addressed the feedback in b2e22f7f:

  • generate() and agenerate() now strip n before delegating to completion, since those APIs only expose one parsed result.
  • Replaced the generate(..., n=...) forwarding test with sync/async tests that assert n is dropped.
  • Added async multi-choice parsing coverage.
  • Added Anthropic coverage confirming n is excluded from the payload.

Validation:

  • PYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev pytest packages/data-designer-engine/tests/engine/models -q532 passed
  • uv run --group dev ruff check ... on touched files → passed

nabinchha added 2 commits May 18, 2026 12:19
Restore the chat completion n request field and preserve all returned choices in the canonical response while keeping response.message as the first choice.

Add coverage for request forwarding, compatibility access, multi-choice parsing, and generate forwarding.

Fixes #620

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
Prevent generate and agenerate from forwarding multi-choice requests that they cannot expose, while keeping completion() multi-choice support intact.

Add coverage for async parsing and Anthropic n exclusion.

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
@nabinchha nabinchha force-pushed the nmulepati/fix-620-chat-completion-choices-n branch from b2e22f7 to a0e0fc0 Compare May 18, 2026 18:20
Copy link
Copy Markdown
Contributor

@johnnygreco johnnygreco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one remaining issue after the direct n stripping update: generate()/agenerate() can still forward n when it comes from model/provider configuration or nested extra_body, so the single-result APIs may still request multiple choices and discard all but the first.

Validation I ran locally on the reviewed diff:

  • uv run pytest packages/data-designer-engine/tests/engine/models/clients/test_parsing.py packages/data-designer-engine/tests/engine/models/clients/test_openai_compatible.py packages/data-designer-engine/tests/engine/models/clients/test_anthropic.py packages/data-designer-engine/tests/engine/models/test_facade.py -> 188 passed
  • git diff --check 71997624b31045c8b0268055f4a14eb02acce905 -> passed


while True:
completion_kwargs = dict(kwargs)
completion_kwargs.pop("n", None)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] This only removes a top-level kwarg before calling completion(). completion() immediately re-applies model/provider defaults via consolidate_kwargs(), and TransportKwargs flattens extra_body, so n from inference_parameters.extra_body, provider extra_body, or caller extra_body={"n": 4} is still sent on generate()/agenerate() and extra choices get discarded. Please clear n after consolidation or otherwise strip it from nested/configured extra_body for both generate paths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ChatCompletionResponse lost .choices and ChatCompletionRequest lost n field

2 participants