-
Notifications
You must be signed in to change notification settings - Fork 0
chore: enable schema support for perplexity #14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughPerplexity adapter: isSchemaSupported() now returns true. Processing of chunks that begin with HTML now delegates to a new protected sanitizeHtmlError(string $html) method which extracts/formats error info (title, h1, optional status); previously raw HTML was returned. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client
participant PerplexityAdapter as Perplexity
participant HTMLParser as sanitizeHtmlError()
Client->>PerplexityAdapter: send chunk (response)
alt chunk starts with '<' (HTML)
PerplexityAdapter->>HTMLParser: sanitizeHtmlError(html)
HTMLParser-->>PerplexityAdapter: formatted error string
PerplexityAdapter-->>Client: return readable error
else non-HTML chunk
PerplexityAdapter-->>Client: process/return chunk as before
end
Note over PerplexityAdapter: isSchemaSupported() now returns true
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/Agents/Adapters/Perplexity.php (1)
74-77: Approve enabling JSON schema support for Perplexity
Verified that Perplexity’s API supports structured outputs via JSON Schema (response_format.type = "json_schema"withjson_schema.schema). The change toisSchemaSupported()is correct. Consider adding an integration test to validate structured responses (accounting for initial schema prep delay and any<think>prefix).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/Agents/Adapters/Perplexity.php (2)
125-132: Good: sanitize HTML error responses early; consider content-type checkThis improves UX vs returning raw HTML. Optionally, also detect via Content-Type header (text/html) if available on Chunk to avoid false positives on unusual payloads.
Can Chunk expose headers (e.g., getHeaders/getHeader)? If yes, we can branch on Content-Type first and keep the current tag-prefix check as fallback.
166-205: Improve HTML parsing and output consistencyRecommend small tweaks: decode HTML entities, broaden status parsing (allow “401 - …”/“401: …”), and avoid leading newline to keep error formatting consistent.
Confirm that removing the leading newline doesn’t break any tests/consumers expecting it.
Apply this diff:
protected function sanitizeHtmlError(string $html): string { - // Try to extract the error from the title tag - if (preg_match('/<title>(.*?)<\/title>/is', $html, $matches)) { - $errorMessage = trim($matches[1]); - - // Extract status code and message if present (e.g., "401 Authorization Required") - if (preg_match('/^(\d{3})\s+(.+)$/i', $errorMessage, $parts)) { - $statusCode = $parts[1]; - $message = $parts[2]; - - return PHP_EOL.'(http_'.$statusCode.') '.$message; - } - - return PHP_EOL.'(html_error) '.$errorMessage; - } + // Try to extract the error from the title tag + if (preg_match('/<title>(.*?)<\/title>/is', $html, $matches)) { + $errorMessage = html_entity_decode(trim(strip_tags($matches[1])), ENT_QUOTES | ENT_HTML5); + + // Extract status code and message if present (e.g., "401 - Authorization Required" or "401: Authorization Required") + if (preg_match('/^(\d{3})\s*[-:—]?\s*(.+)$/i', $errorMessage, $parts)) { + $statusCode = $parts[1]; + $message = $parts[2]; + + return '(http_'.$statusCode.') '.$message; + } + + return '(html_error) '.$errorMessage; + } // Try to extract from h1 tag - if (preg_match('/<h1>(.*?)<\/h1>/is', $html, $matches)) { - $errorMessage = trim(strip_tags($matches[1])); - - if (preg_match('/^(\d{3})\s+(.+)$/i', $errorMessage, $parts)) { - $statusCode = $parts[1]; - $message = $parts[2]; - - return PHP_EOL.'(http_'.$statusCode.') '.$message; - } - - return PHP_EOL.'(html_error) '.$errorMessage; - } + if (preg_match('/<h1>(.*?)<\/h1>/is', $html, $matches)) { + $errorMessage = html_entity_decode(trim(strip_tags($matches[1])), ENT_QUOTES | ENT_HTML5); + + if (preg_match('/^(\d{3})\s*[-:—]?\s*(.+)$/i', $errorMessage, $parts)) { + $statusCode = $parts[1]; + $message = $parts[2]; + + return '(http_'.$statusCode.') '.$message; + } + + return '(html_error) '.$errorMessage; + } // Fallback for unrecognized HTML errors - return PHP_EOL.'(html_error) Received HTML error response from API'; + return '(html_error) Received HTML error response from API'; }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/Agents/Adapters/Perplexity.php (3)
175-176: Consider usingstrip_tags()on title content for consistency.Line 191 uses
strip_tags()when extracting from the<h1>tag, but line 176 doesn't apply the same treatment to the<title>content. If the title contains nested HTML tags, they would appear in the error message.Apply this diff to maintain consistency:
if (preg_match('/<title>(.*?)<\/title>/is', $html, $matches)) { - $errorMessage = trim($matches[1]); + $errorMessage = trim(strip_tags($matches[1])); // Extract status code and message if present (e.g., "401 Authorization Required")
179-184: Optional: Extract duplicate status code parsing logic.The status code extraction logic on lines 179-184 and 193-198 is identical. Consider extracting it into a helper method or restructuring to parse the status code once after extracting the error message from either title or h1.
Example refactor:
protected function sanitizeHtmlError(string $html): string { $errorMessage = null; // Try to extract the error from the title tag if (preg_match('/<title>(.*?)<\/title>/is', $html, $matches)) { $errorMessage = trim(strip_tags($matches[1])); } // Try to extract from h1 tag if not found if ($errorMessage === null && preg_match('/<h1>(.*?)<\/h1>/is', $html, $matches)) { $errorMessage = trim(strip_tags($matches[1])); } // Parse status code if message was found if ($errorMessage !== null) { if (preg_match('/^(\d{3})\s+(.+)$/i', $errorMessage, $parts)) { return '(http_'.$parts[1].') '.$parts[2]; } return '(html_error) '.$errorMessage; } // Fallback for unrecognized HTML errors return '(html_error) Received HTML error response from API'; }Also applies to: 193-198
74-77: JSON schema support confirmed; adjust parsing and add tests.
- Perplexity supports structured JSON schema across all
getModels()values.- Reasoning models (e.g., sonar-reasoning, sonar-reasoning-pro) emit a
<think>prefix—strip this before JSON parsing.- Add an integration test invoking both standard and reasoning models to verify schema support.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Agents/Adapters/Perplexity.php(3 hunks)
🔇 Additional comments (1)
src/Agents/Adapters/Perplexity.php (1)
131-131: LGTM! Improved error handling.Delegating to
sanitizeHtmlError()significantly improves the user experience by providing readable error messages instead of raw HTML.
Summary by CodeRabbit
New Features
Bug Fixes