feat(capture): support OpenRouter as alternative vision provider for image captioning #840
CodAr-man wants to merge 1 commit into
Adds OpenRouter alongside Gemini. Provider selected by API key at runtime:
- OPENROUTER_API_KEY set: uses google/gemma-4-26b-a4b-it via OpenRouter
- GEMINI_API_KEY set: existing Gemini behavior (unchanged)
- Neither: existing DOM-only behavior (unchanged)

Purely additive - Gemini code path untouched.
Heads up — GitHub is rendering Cause: the file was saved with a UTF-8 BOM and CRLF line endings, while the version on To unblock review, could you:
Once that lands the diff will render normally and reviewers can take a real look.

One small thing for the substantive change: the rewrite dropped some of the inline comments around Gemini batching (free-tier 5 RPM vs paid-tier 2000 RPM rationale, the model-override env var note, the per-model benchmark numbers). Those are useful context for future maintainers, so they are worth preserving in the OpenRouter version too.

Separately, I'm opening a

Thanks for the contribution!
Add `* text=auto eol=lf` so text files are checked in with LF regardless of the contributor's OS. Without this, Windows editors can save files with CRLF (and sometimes a UTF-8 BOM), which makes every line differ at the byte level on diff and trips GitHub's "Binary file not shown" heuristic — see heygen-com#840 for an example where a ~30-line change was unreviewable for this reason. Existing LFS rules already carry `-text` and remain unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
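To illustrate the byte-level issue described above, here is a minimal sketch of normalizing a file's contents before committing. The function name `normalizeTextFile` is hypothetical and not part of this repository; it assumes the file has already been read as a UTF-8 string, in which case a BOM shows up as a leading U+FEFF character.

```typescript
// Hypothetical helper: strips a leading UTF-8 BOM and converts CRLF to LF.
// Assumes `content` was decoded as UTF-8, so the BOM appears as U+FEFF.
function normalizeTextFile(content: string): string {
  const withoutBom =
    content.charCodeAt(0) === 0xfeff ? content.slice(1) : content;
  // Replace every CRLF pair with a single LF; lone LF characters are untouched.
  return withoutBom.replace(/\r\n/g, "\n");
}
```

With `* text=auto eol=lf` in `.gitattributes`, Git handles the line-ending conversion on check-in, so a helper like this would only matter for files saved before that rule was in place. Note that the attribute governs line endings, not the BOM.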
What this does
Adds OpenRouter as an alternative AI provider for image captioning in
hyperframes capture, alongside the existing Gemini integration.
How it works
Provider is selected at runtime based on which API key is present:
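- OPENROUTER_API_KEY set: uses google/gemma-4-26b-a4b-it via OpenRouter
- GEMINI_API_KEY / GOOGLE_API_KEY set: existing Gemini behavior
- Neither set: existing DOM-only behavior

OpenRouter is prioritized if both keys are present.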
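The selection order above can be sketched roughly as follows. This is an illustration, not the code in this PR: the `VisionProvider` type and `selectVisionProvider` function are hypothetical names, and treating an empty or whitespace-only key as unset is an assumption.

```typescript
// Hypothetical types and names, for illustration only.
type VisionProvider =
  | { kind: "openrouter"; apiKey: string; model: string }
  | { kind: "gemini"; apiKey: string }
  | { kind: "none" }; // falls back to DOM-only behavior

const OPENROUTER_MODEL = "google/gemma-4-26b-a4b-it";

function selectVisionProvider(
  env: Record<string, string | undefined>,
): VisionProvider {
  // OpenRouter takes priority when both keys are present.
  // Assumption: blank keys are treated the same as missing keys.
  const openRouterKey = env.OPENROUTER_API_KEY?.trim();
  if (openRouterKey) {
    return { kind: "openrouter", apiKey: openRouterKey, model: OPENROUTER_MODEL };
  }
  const geminiKey = env.GEMINI_API_KEY?.trim() || env.GOOGLE_API_KEY?.trim();
  if (geminiKey) {
    return { kind: "gemini", apiKey: geminiKey };
  }
  return { kind: "none" };
}
```

In a Node process this would be called as `selectVisionProvider(process.env)`. Returning a tagged union keeps the "no key" case explicit, so callers cannot forget the DOM-only fallback.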
Why
Not everyone has access to a Gemini API key. OpenRouter provides access to
vision-capable models (including Gemma) with a single unified API, making
this feature accessible to more users.
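As a rough sketch of what a captioning call through that unified API might look like: OpenRouter exposes an OpenAI-compatible chat completions endpoint that accepts images as `image_url` content parts. The function name `buildOpenRouterCaptionRequest`, the prompt text, and the default MIME type below are assumptions for illustration and are not taken from this PR.

```typescript
// Hypothetical request shape, for illustration only.
interface CaptionRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildOpenRouterCaptionRequest(
  apiKey: string,
  imageBase64: string,
  mimeType: string = "image/png", // assumed default
): CaptionRequest {
  const body = {
    model: "google/gemma-4-26b-a4b-it",
    messages: [
      {
        role: "user",
        content: [
          // Assumed prompt; the real prompt used by the capture code may differ.
          { type: "text", text: "Describe this image in one sentence." },
          // The image is passed inline as a base64 data URL.
          {
            type: "image_url",
            image_url: { url: `data:${mimeType};base64,${imageBase64}` },
          },
        ],
      },
    ],
  };
  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed to `fetch(req.url, req.init)`. In the OpenAI-compatible response format, the caption text would be found at `choices[0].message.content`.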
Backwards compatibility
✅ Purely additive — the Gemini code path is completely untouched.
✅ Users with no API keys see identical behavior to before.
✅ Verified working end-to-end on a real site capture.