
fix: Add fallback for OpenAI-compatible APIs without Structured Outputs #990

Open

ambicuity wants to merge 2 commits into datalab-to:master from ambicuity:fix/issue-985-openai-compatible-api-fallback

Conversation

@ambicuity

Summary

Fixes #985

When using OpenAI-compatible APIs (e.g., DeepSeek) that don't support the response_format parameter (Structured Outputs), the LLM step was failing with a 'response_format type is unavailable' error.

Changes

  • Add openai_disable_structured_output config option for users who know their API doesn't support Structured Outputs
  • Implement validate_response() method to parse JSON from plain text responses (pattern from ClaudeService)
  • Automatically detect when Structured Outputs fails (400 error) and fall back to plain chat.completions.create() with the schema in the prompt

Backward Compatibility

Existing OpenAI API users are unaffected - Structured Outputs is still tried first. The fallback only activates when:

  1. User explicitly sets --openai_disable_structured_output, OR
  2. The API returns a 400 error indicating response_format is not supported

Usage

# For APIs that don't support Structured Outputs
marker_single document.pdf --use_llm --openai_disable_structured_output \
  --openai_base_url https://api.deepseek.com \
  --openai_model deepseek-reasoner \
  --openai_api_key YOUR_KEY
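For the fallback path, "schema in the prompt" means serializing the expected JSON schema into the instruction text, since the provider won't enforce it server-side. A rough sketch under that assumption (helper name hypothetical):

```python
import json


def build_fallback_prompt(prompt: str, json_schema: dict) -> str:
    """Embed the response schema in the prompt text for providers
    that reject the response_format parameter."""
    schema_text = json.dumps(json_schema, indent=2)
    return (
        f"{prompt}\n\n"
        "Respond with JSON only, matching this schema:\n"
        f"{schema_text}"
    )
```

The real PR works with Pydantic response models, so the schema dict would presumably be derived from the model (e.g., via model_json_schema()) before being embedded.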

Copilot AI review requested due to automatic review settings February 9, 2026 18:41
@ambicuity
Author

I have read the CLA Document and I hereby sign the CLA

Copilot AI left a comment

Pull request overview

Adds resilience when using OpenAI-compatible providers that reject Structured Outputs (response_format) by introducing a configurable and automatic fallback to plain chat completions plus JSON parsing.

Changes:

  • Added openai_disable_structured_output config flag to force-disable Structured Outputs for incompatible providers.
  • Implemented a plain-text JSON parsing fallback (validate_response) and a fallback call path that embeds the schema in the prompt.
  • Added automatic fallback on certain 400 BadRequest errors indicating response_format is unsupported.


Comment on lines +208 to +210
                    not use_fallback
                    and "response_format" in error_msg
                    or "unavailable" in error_msg

Copilot AI Feb 9, 2026

The BadRequestError fallback condition has incorrect operator precedence: ... and "response_format" in error_msg or "unavailable" in error_msg will trigger whenever "unavailable" appears, even when use_fallback is already true. This can cause misleading logs and repeated retries in fallback mode for unrelated BadRequestError causes (e.g., model unavailable). Group the condition so the fallback only triggers when not already in fallback, e.g. if (not use_fallback) and ("response_format" in error_msg or "unavailable" in error_msg): (and consider tightening the "unavailable" match to response_format-specific messages).

Suggested change
-                    not use_fallback
-                    and "response_format" in error_msg
-                    or "unavailable" in error_msg
+                    (not use_fallback)
+                    and (
+                        "response_format" in error_msg
+                        or "unavailable" in error_msg
+                    )
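The precedence bug is easy to reproduce in isolation, since Python's `and` binds more tightly than `or`:

```python
use_fallback = True  # already in fallback mode
error_msg = "model unavailable"  # unrelated BadRequestError text

# Original condition: parsed as
# ((not use_fallback) and ("response_format" in error_msg)) or ("unavailable" in error_msg)
# so it fires even though we are already in fallback mode.
buggy = not use_fallback and "response_format" in error_msg or "unavailable" in error_msg

# Grouped condition: only fires when not already in fallback mode.
fixed = (not use_fallback) and ("response_format" in error_msg or "unavailable" in error_msg)

print(buggy, fixed)  # True False
```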

Comment on lines +93 to +100
    def _call_with_structured_output(
        self,
        client: openai.OpenAI,
        messages: list,
        response_schema: type[BaseModel],
        timeout: int,
    ) -> dict | None:
        """Call OpenAI API using Structured Outputs (response_format)."""

Copilot AI Feb 9, 2026

_call_with_structured_output is annotated to return dict | None, but it actually returns a 2-tuple (parsed_result, total_tokens). This makes the helper misleading for callers/type checkers and inconsistent with how it’s used in __call__ via tuple-unpacking. Update the return type annotation (and docstring) to reflect the tuple return (e.g., tuple[dict | None, int]) and keep it consistent with _call_with_fallback.

Comment on lines +114 to +121
    def _call_with_fallback(
        self,
        client: openai.OpenAI,
        messages: list,
        response_schema: type[BaseModel],
        timeout: int,
    ) -> dict | None:
        """

Copilot AI Feb 9, 2026

_call_with_fallback is annotated to return dict | None, but it returns a 2-tuple (parsed, total_tokens) (see the return at the end of the method). Align the return type annotation (and docstring) with the actual return value (e.g., tuple[dict | None, int]) to avoid confusion and improve static analysis.

Comment on lines +179 to +216
+        # Track if we should use fallback mode
+        use_fallback = self.openai_disable_structured_output
+
         total_tries = max_retries + 1
         for tries in range(1, total_tries + 1):
             try:
-                response = client.beta.chat.completions.parse(
-                    extra_headers={
-                        "X-Title": "Marker",
-                        "HTTP-Referer": "https://github.com/datalab-to/marker",
-                    },
-                    model=self.openai_model,
-                    messages=messages,
-                    timeout=timeout,
-                    response_format=response_schema,
-                )
-                response_text = response.choices[0].message.content
-                total_tokens = response.usage.total_tokens
+                if use_fallback:
+                    result, total_tokens = self._call_with_fallback(
+                        client, messages, response_schema, timeout
+                    )
+                else:
+                    result, total_tokens = self._call_with_structured_output(
+                        client, messages, response_schema, timeout
+                    )
+
+                if result is None:
+                    logger.warning("LLM did not return a valid response")
+                    return {}

                 if block:
                     block.update_metadata(
                         llm_tokens_used=total_tokens, llm_request_count=1
                     )
-                return json.loads(response_text)
+                return result

+            except BadRequestError as e:
+                # Check if this is a "response_format not supported" error
+                error_msg = str(e).lower()
+                if (
+                    not use_fallback
+                    and "response_format" in error_msg
+                    or "unavailable" in error_msg
+                ):
+                    logger.warning(
+                        f"Structured Outputs not supported by this API, falling back to plain completions: {e}"
+                    )
+                    use_fallback = True
+                    # Retry immediately with fallback

Copilot AI Feb 9, 2026

This PR introduces new fallback behavior (disable structured outputs via config and automatic fallback on specific 400s), but there are no unit tests covering: (1) --openai_disable_structured_output forcing the plain-completions path, (2) a BadRequestError on structured output switching to fallback, and (3) validate_response() correctly parsing fenced JSON. Adding tests with a mocked OpenAI client would help prevent regressions across providers.
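A sketch of one such test, using only unittest.mock (the function and mock names here are illustrative, not marker's actual test suite or service API):

```python
from unittest.mock import MagicMock


def test_bad_request_switches_to_fallback():
    """Simulate the structured-output path raising a provider-style 400
    and check that the call switches to the plain-completions path."""
    structured = MagicMock(side_effect=ValueError("response_format type is unavailable"))
    fallback = MagicMock(return_value=({"html": "<p>ok</p>"}, 10))

    def call_llm():
        # Mirrors the PR's retry logic: try structured output first,
        # then fall back when the error mentions response_format.
        try:
            return structured()
        except ValueError as e:
            if "response_format" in str(e).lower():
                return fallback()
            raise

    result, tokens = call_llm()
    assert result == {"html": "<p>ok</p>"}
    assert tokens == 10
    assert fallback.call_count == 1
```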

Addresses Copilot review feedback. Adds tests covering:
- validate_response() JSON parsing with various formats
- openai_disable_structured_output config option
- Automatic fallback on BadRequestError for unsupported response_format
@charles-wang888

OK, verified correct; you can mark it as finished. Thanks!


Development

Successfully merging this pull request may close these issues.

[BUG: Breaking] OpenAI-compatible API (DeepSeek) fails with "This response_format type is unavailable now" when using --use_llm
