fix: restore chat completion multi-choice support by nabinchha · Pull Request #672 · NVIDIA-NeMo/DataDesigner

nabinchha · 2026-05-18T16:27:06Z

📋 Summary

Restores chat completion multi-choice compatibility by reintroducing the n request field and preserving all returned choices in the canonical response. Existing single-choice callers can continue using response.message, while callers that request multiple completions can use response.choices.

🔗 Related Issue

Fixes #620

🔄 Changes

Add ChatCompletionRequest.n and export a new ChatCompletionChoice response type.
Preserve every returned chat completion choice in ChatCompletionResponse.choices, while keeping response.message as the first choice for existing callers.
Parse all OpenAI-compatible chat completion choices instead of discarding everything after index 0.
Forward n through ModelFacade.completion() and OpenAI-compatible transport bodies, while excluding it from Anthropic request forwarding.
Strip n from generate() and agenerate() before delegating to completion because those APIs return one parsed result.
Add tests for request forwarding, multi-choice parsing, async parsing, compatibility access, Anthropic n exclusion, and generate(..., n=...) stripping.

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

types.py — updates the canonical chat completion response contract while preserving response.message.
facade.py — completion(..., n=...) exposes multiple choices, while generate(..., n=...) strips n because it returns a single parsed result.

🧪 Testing

make test passes (not run; focused model suite was run instead)
PYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev pytest packages/data-designer-engine/tests/engine/models -q passes (532 passed)
uv run --group dev ruff check ... passes for touched files
Unit tests added/updated
E2E tests added/updated (N/A — no E2E surface)

✅ Checklist

Follows commit message conventions
Commits are signed off (DCO)
Architecture docs updated (N/A — no architecture docs needed)

greptile-apps · 2026-05-18T16:29:48Z

Greptile Summary

This PR restores multi-choice support for chat completions by reintroducing the n request field and wiring it through the entire stack — from ChatCompletionRequest through transport, response parsing, and the public ChatCompletionResponse shape. Existing single-choice callers are unaffected because response.message continues to hold the first choice, while new callers can iterate response.choices.

types.py: Adds ChatCompletionChoice, extends ChatCompletionResponse with a choices list and a messages property, and uses __post_init__ to guarantee backward compatibility when choices is omitted.
parsing.py: Refactors both the sync and async parsers to iterate over all returned choices via new helpers (normalize_choice_list, parse_chat_completion_choice, etc.) instead of extracting only index 0.
facade.py / adapters/anthropic.py: Forwards n through completion() and OpenAI-compatible transports while explicitly stripping it from generate()/agenerate() (single-result APIs) and from Anthropic requests (unsupported field).

Confidence Score: 5/5

Safe to merge — backward compatibility is maintained through response.message and the __post_init__ guard, all three transport paths (OpenAI-compatible, Anthropic, facade) are correctly wired, and the new behavior is well-covered by tests.

The change is well-scoped: each modified function has a clear before/after, the n-stripping in generate()/agenerate() is correct and tested, and Anthropic exclusion is explicitly verified. No regressions were found in the parsing, transport, or facade layers.

No files require special attention.

Important Files Changed

Filename	Overview
packages/data-designer-engine/src/data_designer/engine/models/clients/types.py	Adds `ChatCompletionChoice` dataclass and extends `ChatCompletionResponse` with a `choices` list and a `messages` property; backward compat preserved via `__post_init__` guard.
packages/data-designer-engine/src/data_designer/engine/models/clients/parsing.py	Refactors response parsing to iterate over all choices instead of only the first; adds `normalize_choice_list`, `parse_chat_completion_choice`, and async counterparts.
packages/data-designer-engine/src/data_designer/engine/models/facade.py	Adds `n` to `_COMPLETION_REQUEST_FIELDS` so it forwards through `completion()`; strips `n` inside `generate()`/`agenerate()` loops because those methods return a single parsed result.
packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/anthropic.py	Adds `n` to the Anthropic adapter's excluded-fields list, correctly preventing it from being forwarded to the Anthropic API.
packages/data-designer-engine/src/data_designer/engine/models/clients/init.py	Exports the new `ChatCompletionChoice` type alongside the existing public surface.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ModelFacade
    participant ModelClient
    participant Parser

    Caller->>ModelFacade: "completion(messages, n=4)"
    ModelFacade->>ModelFacade: _build_chat_completion_request (n included)
    ModelFacade->>ModelClient: "completion(ChatCompletionRequest[n=4])"
    ModelClient->>Parser: parse_chat_completion_response(raw)
    Parser->>Parser: normalize_choice_list(raw["choices"])
    loop for each choice [0..n-1]
        Parser->>Parser: parse_chat_completion_choice(choice)
    end
    Parser-->>ModelClient: "ChatCompletionResponse(message=choices[0].message, choices=[...])"
    ModelClient-->>ModelFacade: ChatCompletionResponse
    ModelFacade-->>Caller: ChatCompletionResponse

    Caller->>ModelFacade: "generate(prompt, n=4)"
    ModelFacade->>ModelFacade: completion_kwargs.pop("n", None)
    ModelFacade->>ModelFacade: completion(messages) [no n]
    ModelFacade-->>Caller: (parsed_result, messages)

    Note over ModelFacade: Anthropic adapter also strips n from the outgoing HTTP body

_{Reviews (4): Last reviewed commit: "Merge branch 'main' into nmulepati/fix-6..." | Re-trigger Greptile}

github-actions · 2026-05-18T16:34:15Z

Review: PR #672 — fix: restore chat completion multi-choice support

Summary

The PR restores chat-completion multi-choice support (issue #620) by:

Adding n to ChatCompletionRequest and forwarding it through ModelFacade.completion() and OpenAI-compatible transport bodies; explicitly excluded from Anthropic forwarding.
Adding a new ChatCompletionChoice dataclass and a choices: list[ChatCompletionChoice] field on ChatCompletionResponse. response.message is preserved as the first choice for backward compatibility via __post_init__.
Refactoring parse_chat_completion_response / aparse_chat_completion_response to iterate every returned choice (instead of discarding everything after index 0), with shared helpers parse_chat_completion_choice, parse_assistant_message, parse_choice_index, parse_choice_finish_reason, and normalize_choice_list.
Adding tests for request n forwarding, multi-choice parsing, single-choice compatibility, and generate(..., n=...) forwarding.

The change is well-scoped and backward-compatible. Existing single-choice callers continue to work via response.message; new callers can iterate response.choices.

Findings

Correctness

generate(..., n=...) is a footgun. The n keyword is now on the _COMPLETION_REQUEST_FIELDS allowlist (facade.py:83), so callers can pass it to generate() — and the test test_generate_forwards_n_to_completion exercises that. But generate() only ever consumes completion_response.message.content (facade.py:372, facade.py:475); the remaining n-1 choices are silently dropped. Callers pay for n× output tokens for no benefit. Options worth considering:
- Reject n>1 in generate() with a clear error, or
- Document explicitly in the docstring that n is forwarded but only the first choice is parsed (the PR description notes this, but the public API doesn't), or
- Pop n out of kwargs in generate() before delegating, since generate semantics don't currently accommodate it.
Anthropic silently drops n. n is added to AnthropicClient._TRANSPORT_EXCLUDE (correct — Anthropic Messages API doesn't accept it), so a caller who passes n=4 to an Anthropic-backed model gets a single response with no warning. Consider logging at debug or warning level when n>1 is supplied to a provider that doesn't honor it; otherwise this is an easy-to-miss surprise for users switching providers.
generated_images semantics changed for multi-choice. Previously extract_usage received len(images) from the first choice only; now it receives sum(len(c.message.images) for c in choices) (parsing.py:41, parsing.py:55). Identical for n=1 (the only currently exercised path) but worth confirming this aggregation is what billing/usage downstream expects when multi-choice image generation is added.
No validation of n. A caller can pass n=0 or a negative integer; the dataclass accepts it and forwards. Fine if the upstream provider is the source of truth, but a small __post_init__ check on ChatCompletionRequest would fail fast.

API design

ChatCompletionResponse.message and choices[0].message can drift. If both are passed at construction (as some external callers might), __post_init__ only backfills choices when empty — it does not assert consistency. The internal parsers always keep them in sync, but a stricter invariant (assert/normalize) would prevent subtle bugs later. Lightweight: in __post_init__, after backfill, assert self.message is self.choices[0].message.
messages property name collides conceptually with the request messages field. ChatCompletionResponse.messages returns a list[AssistantMessage], while ChatCompletionRequest.messages is the conversation history. assistant_messages or just steering callers toward choices would reduce ambiguity. Minor.

Tests

Coverage of the sync path is solid: request forwarding (test_chat_completion_request_n_is_forwarded_into_body), multi-choice parsing (test_parse_chat_completion_response_preserves_all_choices), backward-compat single-message (test_chat_completion_response_exposes_choices_for_single_message), and facade forwarding (test_completion_forwards_n_to_request, test_generate_forwards_n_to_completion).
No async multi-choice parsing test. aparse_chat_completion_response and aparse_chat_completion_choice mirror the sync logic but are uncovered. A 1-line async equivalent of test_parse_chat_completion_response_preserves_all_choices would close that gap.
No test asserting n is excluded from Anthropic payloads. Adding n to _TRANSPORT_EXCLUDE is a correctness-critical change (Anthropic would 400 on it); a test that builds an AnthropicClient payload from a request with n=4 and asserts "n" not in payload would lock that in.
No test for normalize_choice_list edge cases. Specifically, the "single non-list choice" branch (return [raw_choices]) is untested. Worth a unit test if any provider actually returns that shape — otherwise consider deleting the branch and just handling None + list.

Style / conventions

Follows project conventions: from __future__ import annotations, modern type syntax (int | None), absolute imports, no comments on obvious code. Good.
parse_chat_completion_choice and aparse_chat_completion_choice differ only in sync vs async image extraction — acceptable duplication given the project's existing sync/async split pattern (extract_images_from_chat_message vs aextract_images_from_chat_message).

Performance / security

No performance impact — multi-choice parsing iterates choices once. No new I/O.
No security implications.

Structural Impact

No pre-computed structural impact analysis was available (/tmp/structural-impact-672.md not present). Manual assessment: the change is contained to data_designer.engine.models.clients and data_designer.engine.models.facade. Import direction (interface → engine → config) is preserved; no new cross-package edges. The ChatCompletionResponse shape change is additive (new field with default factory + backfilling __post_init__), so external consumers reading response.message are unaffected. Risk: low.

Verdict

Approve with minor follow-ups. The core change is clean, backward-compatible, and well-tested for the sync OpenAI-compatible path. The two items I'd most like to see addressed before merge:

Decide on the generate(..., n=...) story — either reject, document, or strip n in generate(). Tests currently encode the silent-drop behavior, which makes it harder to change later.
Add a test asserting n is excluded from the Anthropic payload, since that's the only thing preventing a 400 when an Anthropic-backed model is used.

Nice-to-have: async multi-choice parsing test, soft validation of n>=1, and a clarifying note on ChatCompletionResponse.messages vs choices.

nabinchha · 2026-05-18T17:44:13Z

Addressed the feedback in b2e22f7f:

generate() and agenerate() now strip n before delegating to completion, since those APIs only expose one parsed result.
Replaced the generate(..., n=...) forwarding test with sync/async tests that assert n is dropped.
Added async multi-choice parsing coverage.
Added Anthropic coverage confirming n is excluded from the payload.

Validation:

PYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev pytest packages/data-designer-engine/tests/engine/models -q → 532 passed
uv run --group dev ruff check ... on touched files → passed

Restore the chat completion n request field and preserve all returned choices in the canonical response while keeping response.message as the first choice. Add coverage for request forwarding, compatibility access, multi-choice parsing, and generate forwarding. Fixes #620 Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

Prevent generate and agenerate from forwarding multi-choice requests that they cannot expose, while keeping completion() multi-choice support intact. Add coverage for async parsing and Anthropic n exclusion. Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

johnnygreco

I found one remaining issue after the direct n stripping update: generate()/agenerate() can still forward n when it comes from model/provider configuration or nested extra_body, so the single-result APIs may still request multiple choices and discard all but the first.

Validation I ran locally on the reviewed diff:

uv run pytest packages/data-designer-engine/tests/engine/models/clients/test_parsing.py packages/data-designer-engine/tests/engine/models/clients/test_openai_compatible.py packages/data-designer-engine/tests/engine/models/clients/test_anthropic.py packages/data-designer-engine/tests/engine/models/test_facade.py -> 188 passed
git diff --check 71997624b31045c8b0268055f4a14eb02acce905 -> passed

johnnygreco · 2026-05-19T22:07:53Z


        while True:
            completion_kwargs = dict(kwargs)
+            completion_kwargs.pop("n", None)


[P2] This only removes a top-level kwarg before calling completion(). completion() immediately re-applies model/provider defaults via consolidate_kwargs(), and TransportKwargs flattens extra_body, so n from inference_parameters.extra_body, provider extra_body, or caller extra_body={"n": 4} is still sent on generate()/agenerate() and extra choices get discarded. Please clear n after consolidation or otherwise strip it from nested/configured extra_body for both generate paths.

nabinchha requested a review from a team as a code owner May 18, 2026 16:27

nabinchha had a problem deploying to agentic-ci May 18, 2026 16:27 — with GitHub Actions Failure

nabinchha temporarily deployed to agentic-ci May 18, 2026 16:31 — with GitHub Actions Inactive

nabinchha added 2 commits May 18, 2026 12:19

nabinchha force-pushed the nmulepati/fix-620-chat-completion-choices-n branch from b2e22f7 to a0e0fc0 Compare May 18, 2026 18:20

Merge branch 'main' into nmulepati/fix-620-chat-completion-choices-n

b1f0235

johnnygreco reviewed May 19, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: restore chat completion multi-choice support#672

fix: restore chat completion multi-choice support#672
nabinchha wants to merge 3 commits into
mainfrom
nmulepati/fix-620-chat-completion-choices-n

nabinchha commented May 18, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented May 18, 2026 •

edited

Loading

Confidence Score: 5/5

Sequence Diagram

Uh oh!

github-actions Bot commented May 18, 2026

Uh oh!

nabinchha commented May 18, 2026

Uh oh!

johnnygreco left a comment

Uh oh!

johnnygreco May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

nabinchha commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📋 Summary

🔗 Related Issue

🔄 Changes

🔍 Attention Areas

🧪 Testing

✅ Checklist

Uh oh!

greptile-apps Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

github-actions Bot commented May 18, 2026

Review: PR #672 — fix: restore chat completion multi-choice support

Summary

Findings

Correctness

API design

Tests

Style / conventions

Performance / security

Structural Impact

Verdict

Uh oh!

nabinchha commented May 18, 2026

Uh oh!

johnnygreco left a comment

Choose a reason for hiding this comment

Uh oh!

johnnygreco May 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

nabinchha commented May 18, 2026 •

edited

Loading

greptile-apps Bot commented May 18, 2026 •

edited

Loading