
fix: honor local embedding model option #793

Open: mturac wants to merge 1 commit into rohitg00:main from mturac:fix/issue-725-local-embedding-model


mturac commented Jun 2, 2026

Summary

  • pass the configured local embedding model through to the embedding provider
  • keep the existing local endpoint/model defaults when no override is set
  • add regression coverage for custom local embedding model selection

Fixes #725

Validation

  • git diff --check
  • npm test -- test/embedding-provider.test.ts
  • npm run build

Summary by CodeRabbit

  • New Features
    • Local embedding model selection is now configurable via environment variable, allowing users to specify custom models while maintaining a default multilingual model fallback.


vercel Bot commented Jun 2, 2026

@mturac is attempting to deploy a commit to rohitg00's projects team on Vercel.

A member of the Team first needs to authorize it.


coderabbitai Bot commented Jun 2, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: aa9dfceb-3bd5-4f48-90a3-d4724e792d73

📥 Commits

Reviewing files that changed from the base of the PR and between d442fee and 1f0eb55.

📒 Files selected for processing (2)
  • src/providers/embedding/local.ts
  • test/embedding-provider.test.ts

📝 Walkthrough

The PR makes the local embedding provider's model selection configurable via the LOCAL_EMBEDDING_MODEL environment variable, adding a multilingual default (paraphrase-multilingual-MiniLM-L12-v2) that replaces the hardcoded English-only model. Tests confirm both default and custom model scenarios work correctly.
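
The fallback described in the walkthrough can be sketched as follows. This is a minimal illustration, not the code in `src/providers/embedding/local.ts`: the helper name `resolveLocalEmbeddingModel` is hypothetical, the `Xenova/` prefix on the default is taken from the pre-merge check text, and treating a blank value as "unset" is an assumption.

```typescript
// Default named in the PR; the "Xenova/" prefix follows the pre-merge check text.
const DEFAULT_LOCAL_EMBEDDING_MODEL =
  "Xenova/paraphrase-multilingual-MiniLM-L12-v2";

// Hypothetical helper (not from the PR): picks the model name from an
// environment-like object, falling back to the default.
// Assumption: an empty or whitespace-only override counts as "not set".
function resolveLocalEmbeddingModel(
  env: Record<string, string | undefined>,
): string {
  const override = env["LOCAL_EMBEDDING_MODEL"]?.trim();
  return override ? override : DEFAULT_LOCAL_EMBEDDING_MODEL;
}
```

In a Node process the helper would presumably be called as `resolveLocalEmbeddingModel(process.env)`. Passing the environment in as an argument keeps the sketch easy to exercise without mutating global state.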

Changes

Local Embedding Model Configuration

| Layer / File(s) | Summary |
| --- | --- |
| Configurable embedding model constant and pipeline initialization (src/providers/embedding/local.ts) | Exports DEFAULT_LOCAL_EMBEDDING_MODEL constant and updates the transformer pipeline to read from LOCAL_EMBEDDING_MODEL environment variable with fallback. |
| LocalEmbeddingProvider test suite and mock infrastructure (test/embedding-provider.test.ts) | Mocks @xenova/transformers, imports the new provider and constant, extends test cleanup to reset environment and mocks, and adds test cases for default and custom model behavior. |
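
The idea behind the regression tests, checking which model name reaches the pipeline, can be illustrated without a test framework. The sketch below is an assumption-laden stand-in: the real suite mocks `@xenova/transformers`, whereas here a hand-written recording factory is injected instead. `SketchLocalProvider`, `recordingFactory`, and `requestedModelsFor` are hypothetical names, and the lazy, build-once pipeline creation is assumed rather than taken from the PR.

```typescript
type Embedder = (text: string) => Promise<number[]>;
// Same shape as a (task, model) pipeline call; here it is only a fake.
type PipelineFactory = (task: string, model: string) => Promise<Embedder>;
type Env = Record<string, string | undefined>;

const DEFAULT_MODEL = "Xenova/paraphrase-multilingual-MiniLM-L12-v2";

// Hypothetical stand-in for the provider: resolves the model from the
// environment and builds the pipeline once, on first use (assumed behavior).
class SketchLocalProvider {
  private readonly factory: PipelineFactory;
  private readonly env: Env;
  private embedder: Promise<Embedder> | null = null;

  constructor(factory: PipelineFactory, env: Env) {
    this.factory = factory;
    this.env = env;
  }

  embed(text: string): Promise<number[]> {
    if (this.embedder === null) {
      const model = this.env["LOCAL_EMBEDDING_MODEL"] || DEFAULT_MODEL;
      this.embedder = this.factory("feature-extraction", model);
    }
    return this.embedder.then((run) => run(text));
  }
}

// Fake factory that records every model name it is asked to load.
function recordingFactory(): { factory: PipelineFactory; requested: string[] } {
  const requested: string[] = [];
  const factory: PipelineFactory = async (_task, model) => {
    requested.push(model);
    return async () => [0, 0, 0]; // placeholder vector, not a real embedding
  };
  return { factory, requested };
}

// Embeds twice and returns the model names the factory saw.
async function requestedModelsFor(env: Env): Promise<string[]> {
  const { factory, requested } = recordingFactory();
  const provider = new SketchLocalProvider(factory, env);
  await provider.embed("hello");
  await provider.embed("again");
  return requested;
}
```

Injecting the factory is a design choice made for this sketch only; module-level mocking, as the PR's tests do, reaches the same check without changing the provider's constructor.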

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

  • rohitg00

Poem

🐰 Whiskers twitch as models bloom,
Multilingual fills the room,
English-only? No more, dear friend,
Russian queries find their end,
Xenova dances, configured free!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: making the local embedding model configurable via environment variable rather than hardcoded. |
| Linked Issues check | ✅ Passed | The PR fully satisfies issue #725: makes local embedding model configurable via LOCAL_EMBEDDING_MODEL, changes default to multilingual model (Xenova/paraphrase-multilingual-MiniLM-L12-v2), and adds regression test coverage. |
| Out of Scope Changes check | ✅ Passed | All changes are directly within scope: configurable model support, multilingual default, environment variable handling, and corresponding test coverage; no extraneous modifications. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |




Development

Successfully merging this pull request may close these issues.

local embedding provider hardcodes English-only model — make it configurable (multilingual default)
