fix: Add fallback for OpenAI-compatible APIs without Structured Outputs #990
ambicuity wants to merge 2 commits into datalab-to:master
Conversation
Fixes datalab-to#985

When using OpenAI-compatible APIs (e.g., DeepSeek) that don't support the response_format parameter (Structured Outputs), the LLM step was failing with a 'response_format type is unavailable' error.

Changes:
- Add openai_disable_structured_output config option for users who know their API doesn't support Structured Outputs
- Implement validate_response() to parse JSON from plain text responses
- Automatically detect when Structured Outputs fails and fall back to plain chat.completions.create() with the schema in the prompt
- Backward compatible: existing OpenAI API users are unaffected
I have read the CLA Document and I hereby sign the CLA
Pull request overview
Adds resilience when using OpenAI-compatible providers that reject Structured Outputs (response_format) by introducing a configurable and automatic fallback to plain chat completions plus JSON parsing.
Changes:
- Added openai_disable_structured_output config flag to force-disable Structured Outputs for incompatible providers.
- Implemented a plain-text JSON parsing fallback (validate_response) and a fallback call path that embeds the schema in the prompt.
- Added automatic fallback on certain 400 BadRequest errors indicating response_format is unsupported.
```python
not use_fallback
and "response_format" in error_msg
or "unavailable" in error_msg
```
The BadRequestError fallback condition has incorrect operator precedence: ... and "response_format" in error_msg or "unavailable" in error_msg will trigger whenever "unavailable" appears, even when use_fallback is already true. This can cause misleading logs and repeated retries in fallback mode for unrelated BadRequestError causes (e.g., model unavailable). Group the condition so the fallback only triggers when not already in fallback, e.g. if (not use_fallback) and ("response_format" in error_msg or "unavailable" in error_msg): (and consider tightening the "unavailable" match to response_format-specific messages).
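The precedence issue can be demonstrated in isolation (variable values here are illustrative):

```python
# Python's "and" binds tighter than "or", so the original condition
#   not use_fallback and "response_format" in error_msg or "unavailable" in error_msg
# parses as (not use_fallback and ...) or ("unavailable" in error_msg).
use_fallback = True              # already in fallback mode
error_msg = "model unavailable"  # unrelated BadRequest cause

buggy = (
    not use_fallback
    and "response_format" in error_msg
    or "unavailable" in error_msg
)
fixed = (not use_fallback) and (
    "response_format" in error_msg or "unavailable" in error_msg
)
```

Here `buggy` is True (the fallback branch fires again for an unrelated error) while `fixed` is False, which is the intended behavior.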
```diff
-not use_fallback
-and "response_format" in error_msg
-or "unavailable" in error_msg
+(not use_fallback)
+and (
+    "response_format" in error_msg
+    or "unavailable" in error_msg
+)
```
```python
def _call_with_structured_output(
    self,
    client: openai.OpenAI,
    messages: list,
    response_schema: type[BaseModel],
    timeout: int,
) -> dict | None:
    """Call OpenAI API using Structured Outputs (response_format)."""
```
_call_with_structured_output is annotated to return dict | None, but it actually returns a 2-tuple (parsed_result, total_tokens). This makes the helper misleading for callers/type checkers and inconsistent with how it’s used in __call__ via tuple-unpacking. Update the return type annotation (and docstring) to reflect the tuple return (e.g., tuple[dict | None, int]) and keep it consistent with _call_with_fallback.
```python
def _call_with_fallback(
    self,
    client: openai.OpenAI,
    messages: list,
    response_schema: type[BaseModel],
    timeout: int,
) -> dict | None:
    """
```
_call_with_fallback is annotated to return dict | None, but it returns a 2-tuple (parsed, total_tokens) (see the return at the end of the method). Align the return type annotation (and docstring) with the actual return value (e.g., tuple[dict | None, int]) to avoid confusion and improve static analysis.
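A minimal sketch of the corrected contract; the helper name and body below are illustrative stand-ins, not the PR's code:

```python
from __future__ import annotations

import json


def call_with_fallback_demo(payload: str) -> tuple[dict | None, int]:
    """Return (parsed_result, total_tokens); parsed_result is None on failure."""
    try:
        parsed = json.loads(payload)
    except json.JSONDecodeError:
        parsed = None
    # Stand-in for response.usage.total_tokens
    total_tokens = len(payload.split())
    return parsed, total_tokens


# Callers tuple-unpack, matching how __call__ consumes the helpers
parsed, total_tokens = call_with_fallback_demo('{"ok": true}')
```

Annotating `tuple[dict | None, int]` on both helpers keeps the tuple-unpacking call sites consistent for type checkers.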
```diff
+     # Track if we should use fallback mode
+     use_fallback = self.openai_disable_structured_output

      total_tries = max_retries + 1
      for tries in range(1, total_tries + 1):
          try:
-             response = client.beta.chat.completions.parse(
-                 extra_headers={
-                     "X-Title": "Marker",
-                     "HTTP-Referer": "https://github.com/datalab-to/marker",
-                 },
-                 model=self.openai_model,
-                 messages=messages,
-                 timeout=timeout,
-                 response_format=response_schema,
-             )
-             response_text = response.choices[0].message.content
-             total_tokens = response.usage.total_tokens
+             if use_fallback:
+                 result, total_tokens = self._call_with_fallback(
+                     client, messages, response_schema, timeout
+                 )
+             else:
+                 result, total_tokens = self._call_with_structured_output(
+                     client, messages, response_schema, timeout
+                 )
+
+             if result is None:
+                 logger.warning("LLM did not return a valid response")
+                 return {}

              if block:
                  block.update_metadata(
                      llm_tokens_used=total_tokens, llm_request_count=1
                  )
-             return json.loads(response_text)
+             return result

          except BadRequestError as e:
+             # Check if this is a "response_format not supported" error
+             error_msg = str(e).lower()
+             if (
+                 not use_fallback
+                 and "response_format" in error_msg
+                 or "unavailable" in error_msg
+             ):
+                 logger.warning(
+                     f"Structured Outputs not supported by this API, falling back to plain completions: {e}"
+                 )
+                 use_fallback = True
+                 # Retry immediately with fallback
```
This PR introduces new fallback behavior (disable structured outputs via config and automatic fallback on specific 400s), but there are no unit tests covering: (1) --openai_disable_structured_output forcing the plain-completions path, (2) a BadRequestError on structured output switching to fallback, and (3) validate_response() correctly parsing fenced JSON. Adding tests with a mocked OpenAI client would help prevent regressions across providers.
Addresses Copilot review feedback. Adds tests covering:
- validate_response() JSON parsing with various formats
- openai_disable_structured_output config option
- Automatic fallback on BadRequestError for unsupported response_format
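One way to exercise the automatic fallback without a live API is a small fake client; everything below (FakeClient, FakeBadRequestError, call_llm) is a hypothetical stand-in for the PR's code and the openai SDK, sketched for illustration:

```python
import json


class FakeBadRequestError(Exception):
    """Stand-in for openai.BadRequestError."""


class FakeClient:
    """Mocked client: the structured call fails, the plain call succeeds."""

    def __init__(self):
        self.calls = []

    def parse(self, **kwargs):  # stand-in for beta.chat.completions.parse
        self.calls.append("structured")
        raise FakeBadRequestError("response_format type is unavailable")

    def create(self, **kwargs):  # stand-in for chat.completions.create
        self.calls.append("plain")
        return '{"answer": 42}'


def call_llm(client, use_fallback=False, max_retries=1):
    """Simplified version of the PR's retry loop with automatic fallback."""
    for _ in range(max_retries + 1):
        try:
            if use_fallback:
                return json.loads(client.create())
            return json.loads(client.parse())
        except FakeBadRequestError as e:
            msg = str(e).lower()
            # Properly grouped condition, per the review suggestion
            if (not use_fallback) and (
                "response_format" in msg or "unavailable" in msg
            ):
                use_fallback = True  # retry immediately with fallback


client = FakeClient()
result = call_llm(client)
```

The first attempt hits the structured path and fails; the retry takes the plain path, so `result` is the parsed JSON and `client.calls` records both attempts in order.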
OK, verified correct, you can mark it as finished. Thanks!
Summary
Fixes #985
When using OpenAI-compatible APIs (e.g., DeepSeek) that don't support the response_format parameter (Structured Outputs), the LLM step was failing with a 'response_format type is unavailable' error.

Changes
- Add openai_disable_structured_output config option for users who know their API doesn't support Structured Outputs
- Add validate_response() method to parse JSON from plain text responses (pattern from ClaudeService)
- Automatic fallback to plain chat.completions.create() with the schema in the prompt

Backward Compatibility
Existing OpenAI API users are unaffected; Structured Outputs is still tried first. The fallback only activates when:
- The user explicitly passes --openai_disable_structured_output, OR
- A BadRequestError indicates response_format is unsupported

Usage
```shell
# For APIs that don't support Structured Outputs
marker_single document.pdf --use_llm --openai_disable_structured_output \
  --openai_base_url https://api.deepseek.com \
  --openai_model deepseek-reasoner \
  --openai_api_key YOUR_KEY
```