Add OpenRouter app attribution headers#708
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughAdds optional OpenRouter app attribution settings and utilities, wires conditional attribution headers into OpenAI client construction for registry and embedding clients, updates tests, and introduces local-embedding tooling: a docker-compose for local testing, a re-embed pgvector script, and a semantic benchmark CLI. ChangesOpenRouter Attribution
Local Embeddings & Tooling
🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers:
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/llm/test_embedding_client.py`:
- Around line 39-45: The FakeOpenAIClient used in
test_openai_embedding_client_rejects_dimension_mismatch must accept the
default_headers kwarg because _EmbeddingClient forwards default_headers into
AsyncOpenAI; update FakeOpenAIClient's __init__ signature to include
default_headers: dict[str,str] | None = None (and store or ignore it) so
instantiation from AsyncOpenAI won't raise TypeError, keeping the rest of the
fake behavior intact to allow the test to reach the dimension-mismatch
assertion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 84710182-0de4-4e1b-b82e-ee2901b76a8f
📒 Files selected for processing (7)
config.toml.examplesrc/config.pysrc/embedding_client.pysrc/llm/openrouter.pysrc/llm/registry.pytests/llm/test_embedding_client.pytests/llm/test_openrouter_attribution.py
| def __init__( | ||
| self, | ||
| *, | ||
| api_key: str | None, | ||
| base_url: str | None, | ||
| default_headers: dict[str, str] | None = None, | ||
| ) -> None: |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify _EmbeddingClient passes default_headers and all FakeOpenAIClient
# test doubles in this file accept the kwarg.
rg -n -C2 'AsyncOpenAI\(' src/embedding_client.py
rg -n -C3 'class FakeOpenAIClient|def __init__\(' tests/llm/test_embedding_client.pyRepository: plastic-labs/honcho
Length of output: 1658
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show how _EmbeddingClient constructs AsyncOpenAI (and whether default_headers is passed)
echo "== src/embedding_client.py: AsyncOpenAI construction =="
sed -n '150,230p' src/embedding_client.py
# Show the FakeOpenAIClient definitions with their full init signatures
echo
echo "== tests/llm/test_embedding_client.py: FakeOpenAIClient init signatures =="
sed -n '1,220p' tests/llm/test_embedding_client.py | nl -ba | sed -n '20,120p'
sed -n '1,220p' tests/llm/test_embedding_client.py | nl -ba | sed -n '140,220p'Repository: plastic-labs/honcho
Length of output: 3768
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== tests/llm/test_embedding_client.py: FakeOpenAIClient init signatures (excerpts) =="
# Full file might be small; otherwise just focus on likely ranges
wc -l tests/llm/test_embedding_client.py
cat -n tests/llm/test_embedding_client.py | sed -n '1,130p'
echo
cat -n tests/llm/test_embedding_client.py | sed -n '130,230p'Repository: plastic-labs/honcho
Length of output: 8720
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "class FakeOpenAIClient" tests/llm/test_embedding_client.py
echo "----"
rg -n "def __init__" tests/llm/test_embedding_client.py | head -n 50
echo "----"
# Print the two FakeOpenAIClient blocks fully (small and targeted)
sed -n '30,110p' tests/llm/test_embedding_client.py
echo "----"
sed -n '150,210p' tests/llm/test_embedding_client.pyRepository: plastic-labs/honcho
Length of output: 4676
Update FakeOpenAIClient in test_openai_embedding_client_rejects_dimension_mismatch to accept default_headers.
_EmbeddingClient passes default_headers into AsyncOpenAI, but the fake in that test doesn’t accept the kwarg, so the test will fail with TypeError before reaching the dimension-mismatch assertion.
Proposed fix
class FakeOpenAIClient:
- def __init__(self, *, api_key: str | None, base_url: str | None) -> None:
+ def __init__(
+ self,
+ *,
+ api_key: str | None,
+ base_url: str | None,
+ default_headers: dict[str, str] | None = None,
+ ) -> None:
self.embeddings: FakeOpenAIEmbeddingsAPI = fake_embeddings📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def __init__( | |
| self, | |
| *, | |
| api_key: str | None, | |
| base_url: str | None, | |
| default_headers: dict[str, str] | None = None, | |
| ) -> None: | |
| class FakeOpenAIClient: | |
| def __init__( | |
| self, | |
| *, | |
| api_key: str | None, | |
| base_url: str | None, | |
| default_headers: dict[str, str] | None = None, | |
| ) -> None: | |
| self.embeddings: FakeOpenAIEmbeddingsAPI = fake_embeddings |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/llm/test_embedding_client.py` around lines 39 - 45, The
FakeOpenAIClient used in test_openai_embedding_client_rejects_dimension_mismatch
must accept the default_headers kwarg because _EmbeddingClient forwards
default_headers into AsyncOpenAI; update FakeOpenAIClient's __init__ signature
to include default_headers: dict[str,str] | None = None (and store or ignore it)
so instantiation from AsyncOpenAI won't raise TypeError, keeping the rest of the
fake behavior intact to allow the test to reach the dimension-mismatch
assertion.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
scripts/reembed_pgvector.py (1)
35-46: 💤 Low valueSQL table interpolation is safe here (S608 false positive).
Ruff flags the f-string SQL, but
tableoriginates only from theTABLESconstant / argparsechoices, so it is not externally influenced. No action needed; optionally add anassert table in TABLESat the top of these helpers as defense against future refactors that call them with unchecked input.Also applies to: 49-65
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/reembed_pgvector.py` around lines 35 - 46, The SQL f-string in async def _count_remaining(table: str) is flagged but table is only allowed from TABLES/argparse choices; add a defensive check at the top of _count_remaining (and the other helper functions around lines 49-65 that also interpolate table) such as an assert or explicit if raising ValueError to ensure table in TABLES before executing the query so the linter/maintainers know the input is constrained and future refactors won’t introduce unsafe interpolation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@scripts/reembed_pgvector.py`:
- Around line 35-46: The SQL f-string in async def _count_remaining(table: str)
is flagged but table is only allowed from TABLES/argparse choices; add a
defensive check at the top of _count_remaining (and the other helper functions
around lines 49-65 that also interpolate table) such as an assert or explicit if
raising ValueError to ensure table in TABLES before executing the query so the
linter/maintainers know the input is constrained and future refactors won’t
introduce unsafe interpolation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f1f2cc93-fea9-45da-9507-45d439581872
📒 Files selected for processing (3)
docker-compose.local-embeddings.ymlscripts/reembed_pgvector.pyscripts/semantic_benchmark.py
✅ Files skipped from review due to trivial changes (1)
- docker-compose.local-embeddings.yml
Summary
Validation
Note: full pytest was not run locally because the production image excludes dev dependencies and the Pi host lacks the full Honcho test environment; CI should run the full suite.
Summary by CodeRabbit
New Features
Configuration
Docker
Tests