@ChiragAgg5k ChiragAgg5k commented Oct 13, 2025

Summary by CodeRabbit

  • New Features

    • Enabled JSON Schema (structured output) support for the Perplexity integration, allowing schema-constrained responses in workflows and templates.
  • Bug Fixes

    • Improved HTML error handling: HTML error responses are now sanitized and presented as readable messages (including title/h1 and optional status) instead of raw HTML.

coderabbitai bot commented Oct 13, 2025

Walkthrough

In the Perplexity adapter, isSchemaSupported() now returns true. Chunks that begin with HTML are now passed to a new protected method, sanitizeHtmlError(string $html), which extracts and formats the error information (title, h1, optional status). Previously the raw HTML was returned.

Changes

Cohort / File(s): Perplexity adapter implementation updates (src/Agents/Adapters/Perplexity.php)
Summary of edits:
- isSchemaSupported() return value changed from false to true.
- Replaced direct return of raw HTML when a chunk begins with HTML with a call to a new protected method sanitizeHtmlError(string $html): string.
- Added protected function sanitizeHtmlError(string $html): string which parses HTML (title, first H1, optional HTTP status) and produces a readable fallback message for unrecognized HTML.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant PerplexityAdapter as Perplexity
    participant HTMLParser as sanitizeHtmlError()

    Client->>PerplexityAdapter: send chunk (response)
    alt chunk starts with '<' (HTML)
        PerplexityAdapter->>HTMLParser: sanitizeHtmlError(html)
        HTMLParser-->>PerplexityAdapter: formatted error string
        PerplexityAdapter-->>Client: return readable error
    else non-HTML chunk
        PerplexityAdapter-->>Client: process/return chunk as before
    end
    Note over PerplexityAdapter: isSchemaSupported() now returns true

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I nibble bytes beneath the moon,
A tiny tweak, a gentle tune—
When HTML sneaks in to perplex,
I parse the title, hush the hex.
Hop! A fix, polite and neat,
My whiskers twitch — error now speaks sweet.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check | ✅ Passed | The title clearly and concisely describes the primary change of activating schema support in the Perplexity adapter, aligning directly with the code modifications that update isSchemaSupported() and related behavior without introducing unnecessary detail or noise.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/Agents/Adapters/Perplexity.php (1)

74-77: Approve enabling JSON schema support for Perplexity
Verified that Perplexity’s API supports structured outputs via JSON Schema (response_format.type = "json_schema" with json_schema.schema). The change to isSchemaSupported() is correct. Consider adding an integration test to validate structured responses (accounting for initial schema prep delay and any <think> prefix).
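
As a rough illustration of the request shape mentioned above (response_format.type = "json_schema" with json_schema.schema), the following sketch builds such a payload. The helper name buildStructuredRequest, the model name, and the example schema are assumptions made for illustration; they are not taken from the adapter's code.

```php
<?php

declare(strict_types=1);

/**
 * Build a chat-style payload that asks for schema-constrained output.
 * Hypothetical helper: the adapter's real request builder may differ.
 *
 * @param array<string, mixed> $schema JSON Schema describing the expected object
 * @return array<string, mixed>
 */
function buildStructuredRequest(string $model, string $prompt, array $schema): array
{
    return [
        'model' => $model,
        'messages' => [
            ['role' => 'user', 'content' => $prompt],
        ],
        // Shape described in the review comment: type plus nested schema.
        'response_format' => [
            'type' => 'json_schema',
            'json_schema' => ['schema' => $schema],
        ],
    ];
}

// Example schema (assumed for illustration only).
$schema = [
    'type' => 'object',
    'properties' => [
        'city' => ['type' => 'string'],
        'population' => ['type' => 'integer'],
    ],
    'required' => ['city', 'population'],
];

$payload = buildStructuredRequest('sonar', 'What is the largest city in Japan?', $schema);
echo json_encode($payload, JSON_PRETTY_PRINT), PHP_EOL;
```

An integration test could send a payload like this and assert that the decoded response contains every key listed under required.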

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d2f173 and 3eb9f98.

📒 Files selected for processing (1)
  • src/Agents/Adapters/Perplexity.php (1 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/Agents/Adapters/Perplexity.php (2)

125-132: Good: sanitize HTML error responses early; consider content-type check

This improves UX vs returning raw HTML. Optionally, also detect via Content-Type header (text/html) if available on Chunk to avoid false positives on unusual payloads.

Can Chunk expose headers (e.g., getHeaders/getHeader)? If yes, we can branch on Content-Type first and keep the current tag-prefix check as fallback.
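
A minimal sketch of the header-first idea, assuming the Content-Type value can be obtained somehow. Whether Chunk exposes headers is exactly the open question above, so the function below takes the header as a plain nullable string. The name looksLikeHtml and the list of sniffed tags are illustrative assumptions, not the adapter's current check.

```php
<?php

declare(strict_types=1);

/**
 * Decide whether a response body should be treated as an HTML error page.
 * $contentType is the raw Content-Type header value, or null when the
 * transport does not expose headers (hypothetical input).
 */
function looksLikeHtml(?string $contentType, string $body): bool
{
    if ($contentType !== null && $contentType !== '') {
        // Header present: trust it, ignoring parameters such as "; charset=utf-8".
        $mediaType = strtolower(trim(explode(';', $contentType, 2)[0]));

        return $mediaType === 'text/html' || $mediaType === 'application/xhtml+xml';
    }

    // Fallback: sniff the start of the body for common HTML openers only.
    return preg_match('/^\s*<(?:!doctype\s+html|html|head|body)\b/i', $body) === 1;
}

var_dump(looksLikeHtml('text/html; charset=UTF-8', '<html><title>401</title></html>'));
var_dump(looksLikeHtml(null, '<!DOCTYPE html><html></html>'));
var_dump(looksLikeHtml(null, '<think>reasoning</think>{"a":1}'));
```

Sniffing for specific openers rather than any leading "<" keeps a payload that merely starts with a tag-like prefix from being reported as an HTML error.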


166-205: Improve HTML parsing and output consistency

Recommend small tweaks: decode HTML entities, broaden status parsing (allow “401 - …”/“401: …”), and avoid leading newline to keep error formatting consistent.

Confirm that removing the leading newline doesn’t break any tests/consumers expecting it.

Apply this diff:

 protected function sanitizeHtmlError(string $html): string
 {
-        // Try to extract the error from the title tag
-        if (preg_match('/<title>(.*?)<\/title>/is', $html, $matches)) {
-            $errorMessage = trim($matches[1]);
-
-            // Extract status code and message if present (e.g., "401 Authorization Required")
-            if (preg_match('/^(\d{3})\s+(.+)$/i', $errorMessage, $parts)) {
-                $statusCode = $parts[1];
-                $message = $parts[2];
-
-                return PHP_EOL.'(http_'.$statusCode.') '.$message;
-            }
-
-            return PHP_EOL.'(html_error) '.$errorMessage;
-        }
+        // Try to extract the error from the title tag
+        if (preg_match('/<title>(.*?)<\/title>/is', $html, $matches)) {
+            $errorMessage = html_entity_decode(trim(strip_tags($matches[1])), ENT_QUOTES | ENT_HTML5);
+
+            // Extract status code and message if present (e.g., "401 - Authorization Required" or "401: Authorization Required")
+            if (preg_match('/^(\d{3})(?:\s*[-:\x{2014}]\s*|\s+)(.+)$/u', $errorMessage, $parts)) {
+                $statusCode = $parts[1];
+                $message = $parts[2];
+
+                return '(http_'.$statusCode.') '.$message;
+            }
+
+            return '(html_error) '.$errorMessage;
+        }
 
         // Try to extract from h1 tag
-        if (preg_match('/<h1>(.*?)<\/h1>/is', $html, $matches)) {
-            $errorMessage = trim(strip_tags($matches[1]));
-
-            if (preg_match('/^(\d{3})\s+(.+)$/i', $errorMessage, $parts)) {
-                $statusCode = $parts[1];
-                $message = $parts[2];
-
-                return PHP_EOL.'(http_'.$statusCode.') '.$message;
-            }
-
-            return PHP_EOL.'(html_error) '.$errorMessage;
-        }
+        if (preg_match('/<h1>(.*?)<\/h1>/is', $html, $matches)) {
+            $errorMessage = html_entity_decode(trim(strip_tags($matches[1])), ENT_QUOTES | ENT_HTML5);
+
+            if (preg_match('/^(\d{3})(?:\s*[-:\x{2014}]\s*|\s+)(.+)$/u', $errorMessage, $parts)) {
+                $statusCode = $parts[1];
+                $message = $parts[2];
+
+                return '(http_'.$statusCode.') '.$message;
+            }
+
+            return '(html_error) '.$errorMessage;
+        }
 
         // Fallback for unrecognized HTML errors
-        return PHP_EOL.'(html_error) Received HTML error response from API';
+        return '(html_error) Received HTML error response from API';
 }
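
To make the status parsing concrete, here is a small standalone sketch. The helper name parseStatusLine is hypothetical and not part of the adapter. It accepts "401 Message", "401 - Message", and "401: Message", and it requires either a separator or whitespace after the three digits so that a longer number is not read as a status code. The pattern assumes UTF-8 input because of the u modifier.

```php
<?php

declare(strict_types=1);

/**
 * Split text such as "401 - Authorization Required" into code and message.
 * Hypothetical helper for illustration; returns a null code when no
 * three-digit status prefix is found.
 *
 * @return array{code: int|null, message: string}
 */
function parseStatusLine(string $text): array
{
    // Three digits, then either an optional-space separator (-, :, U+2014)
    // or at least one whitespace character, then the message.
    $pattern = '/^(\d{3})(?:\s*[-:\x{2014}]\s*|\s+)(\S.*)$/u';

    if (preg_match($pattern, $text, $parts) === 1) {
        return ['code' => (int) $parts[1], 'message' => $parts[2]];
    }

    return ['code' => null, 'message' => $text];
}

foreach (['401 Authorization Required', '401 - Authorization Required', '4010 widgets found'] as $title) {
    $parsed = parseStatusLine($title);
    echo $parsed['code'] === null
        ? '(html_error) '.$parsed['message']
        : '(http_'.$parsed['code'].') '.$parsed['message'];
    echo PHP_EOL;
}
```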
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 3eb9f98 and f9be080.

📒 Files selected for processing (1)
  • src/Agents/Adapters/Perplexity.php (3 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/Agents/Adapters/Perplexity.php (3)

175-176: Consider using strip_tags() on title content for consistency.

Line 191 uses strip_tags() when extracting from the <h1> tag, but line 176 doesn't apply the same treatment to the <title> content. If the title contains nested HTML tags, they would appear in the error message.

Apply this diff to maintain consistency:

         if (preg_match('/<title>(.*?)<\/title>/is', $html, $matches)) {
-            $errorMessage = trim($matches[1]);
+            $errorMessage = trim(strip_tags($matches[1]));
 
             // Extract status code and message if present (e.g., "401 Authorization Required")

179-184: Optional: Extract duplicate status code parsing logic.

The status code extraction logic on lines 179-184 and 193-198 is identical. Consider extracting it into a helper method or restructuring to parse the status code once after extracting the error message from either title or h1.

Example refactor:

protected function sanitizeHtmlError(string $html): string
{
    $errorMessage = null;
    
    // Try to extract the error from the title tag
    if (preg_match('/<title>(.*?)<\/title>/is', $html, $matches)) {
        $errorMessage = trim(strip_tags($matches[1]));
    }
    
    // Try to extract from h1 tag if not found
    if ($errorMessage === null && preg_match('/<h1>(.*?)<\/h1>/is', $html, $matches)) {
        $errorMessage = trim(strip_tags($matches[1]));
    }
    
    // Parse status code if message was found
    if ($errorMessage !== null) {
        if (preg_match('/^(\d{3})\s+(.+)$/i', $errorMessage, $parts)) {
            return '(http_'.$parts[1].') '.$parts[2];
        }
        return '(html_error) '.$errorMessage;
    }
    
    // Fallback for unrecognized HTML errors
    return '(html_error) Received HTML error response from API';
}

Also applies to: 193-198


74-77: JSON schema support confirmed; adjust parsing and add tests.

  • Perplexity supports structured JSON schema across all getModels() values.
  • Reasoning models (e.g., sonar-reasoning, sonar-reasoning-pro) emit a <think> prefix—strip this before JSON parsing.
  • Add an integration test invoking both standard and reasoning models to verify schema support.
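
The <think> handling could look roughly like the sketch below. It assumes the reasoning text is wrapped in a single leading <think>...</think> block followed by the JSON document; the closing tag and the helper name decodeStructuredResponse are assumptions, since the comment above only mentions a <think> prefix.

```php
<?php

declare(strict_types=1);

/**
 * Strip one leading <think>...</think> block, then decode the remaining JSON.
 * Hypothetical helper for illustration; not part of the adapter.
 *
 * @return array<mixed>
 * @throws JsonException when the remaining text is not valid JSON
 */
function decodeStructuredResponse(string $content): array
{
    // Non-greedy match with the s modifier so the block may span lines.
    $json = preg_replace('/^\s*<think>.*?<\/think>\s*/s', '', $content, 1);

    if ($json === null) {
        throw new RuntimeException('Failed to strip the reasoning prefix.');
    }

    $decoded = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($decoded)) {
        throw new UnexpectedValueException('Expected a JSON object or array.');
    }

    return $decoded;
}

$raw = "<think>The user asks about Japan.\nTokyo is the largest city.</think>\n{\"city\":\"Tokyo\"}";
var_dump(decodeStructuredResponse($raw));
```

Responses from standard models, which carry no prefix, pass through the same function unchanged, so a test can exercise both model families with one code path.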
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 3eb9f98 and 2c0c4da.

📒 Files selected for processing (1)
  • src/Agents/Adapters/Perplexity.php (3 hunks)
🔇 Additional comments (1)
src/Agents/Adapters/Perplexity.php (1)

131-131: LGTM! Improved error handling.

Delegating to sanitizeHtmlError() significantly improves the user experience by providing readable error messages instead of raw HTML.

@ChiragAgg5k ChiragAgg5k merged commit 667da99 into main Oct 13, 2025
4 of 6 checks passed
@ChiragAgg5k ChiragAgg5k deleted the enable-schema-support-perplexity branch October 13, 2025 14:57