feat(embeddings): support passthrough remote model ids #156
Open
sanikolaev wants to merge 1 commit into master
1. Allow explicit provider-prefixed passthrough model ids for remote endpoints
   - keep the existing slash-prefixed forms (openai/..., voyage/..., jina/...) working as before
   - add explicit colon-prefixed forms (openai:..., voyage:..., jina:...)
   - when the colon form is used, pass the model id through after stripping only the provider prefix
   - this allows OpenAI-compatible custom endpoints to receive full upstream model ids unchanged, for example:
     - openai:openai/text-embedding-ada-002
     - openai:jinaai/jina-embeddings-v3
   - preserve strict built-in validation for default provider endpoints while allowing passthrough mode for custom API_URL-based setups

2. Allow CMake to pass optional cargo features to the embeddings crate
   - add EMBEDDINGS_CARGO_FEATURE_ARGS in cmake/build_embeddings.cmake
   - if EMBEDDINGS_CARGO_FEATURES is set, convert it to a valid cargo CLI fragment: --features <value>
   - this makes it possible to configure builds such as download-ort from the CMake side without hard-coding the flag in the build script

Additional remote-model adjustment:
   - cache inferred embedding dimensionality in remote providers so passthrough/custom models can learn their vector dimension from a successful response instead of requiring a built-in static mapping
   - apply that caching approach consistently across OpenAI, Voyage, and Jina
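The prefix handling and dimension caching described above could look roughly like the sketch below. It is not the code from this PR: `Provider`, `ParsedModel`, `parse_model_id`, and `DimCache` are hypothetical names, and treating the first `:` or `/` as the provider separator is an assumption about how the two forms are told apart.

```rust
use std::sync::OnceLock;

/// Remote providers named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provider {
    OpenAi,
    Voyage,
    Jina,
}

/// Result of parsing a model id (hypothetical type, not taken from the PR).
#[derive(Debug, PartialEq, Eq)]
struct ParsedModel<'a> {
    provider: Provider,
    /// Model id to send upstream, with only the provider prefix stripped.
    model: &'a str,
    /// True for the colon form, which is passed through without built-in validation.
    passthrough: bool,
}

fn provider_from_name(name: &str) -> Option<Provider> {
    match name {
        "openai" => Some(Provider::OpenAi),
        "voyage" => Some(Provider::Voyage),
        "jina" => Some(Provider::Jina),
        _ => None,
    }
}

/// Parse `provider:model` (passthrough) or `provider/model` (existing form).
/// Assumption: the first `:` or `/` in the id separates provider from model.
/// Returns None for ids without a known remote provider prefix.
fn parse_model_id(id: &str) -> Option<ParsedModel<'_>> {
    let pos = id.find(|c: char| c == ':' || c == '/')?;
    let provider = provider_from_name(&id[..pos])?;
    let model = &id[pos + 1..];
    if model.is_empty() {
        return None;
    }
    Some(ParsedModel {
        provider,
        model,
        passthrough: id.as_bytes()[pos] == b':',
    })
}

/// Caches the embedding dimension learned from the first successful response.
struct DimCache(OnceLock<usize>);

impl DimCache {
    fn new() -> Self {
        DimCache(OnceLock::new())
    }

    /// Record the dimension on first use; later calls return the cached value.
    fn observe(&self, embedding: &[f32]) -> usize {
        *self.0.get_or_init(|| embedding.len())
    }

    /// Cached dimension, if a response has been observed yet.
    fn get(&self) -> Option<usize> {
        self.0.get().copied()
    }
}
```

With this sketch, `openai:openai/text-embedding-ada-002` parses to provider `OpenAi` with model `openai/text-embedding-ada-002` in passthrough mode, while `openai/text-embedding-ada-002` keeps the existing behavior and yields model `text-embedding-ada-002`. A local id such as `sentence-transformers/all-MiniLM-L6-v2` returns `None`, so it never reaches the remote path.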
3d92e0e to ecba6b3
Windows test results: 5 files, 5 suites, 21m 18s ⏱️. Results for commit ecba6b3.
❌ CLT tests
Failed tests: test/clt-tests/mcl/auto-embeddings-json-api.rec
––– input –––
rm -f /var/log/manticore/searchd.log; stdbuf -oL searchd $SEARCHD_FLAGS > /dev/null; if timeout 10 grep -qm1 '\[BUDDY\] started' <(tail -n 1000 -f /var/log/manticore/searchd.log); then echo 'Buddy started!'; else echo 'Timeout or failed!'; cat /var/log/manticore/searchd.log;fi
––– output –––
OK
––– input –––
apt-get install jq -y > /dev/null; echo $?
––– output –––
- debconf: delaying package configuration, since apt-utils is not installed
+ E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/j/jq/libjq1_1.7.1-3ubuntu0.24.04.1_amd64.deb Connection failed [IP: 185.125.190.82 80]
- 0
+ E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
+ 100
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_json_columnar (
title TEXT,
content TEXT,
embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2'
MODEL_NAME='sentence-transformers/all-MiniLM-L6-v2'
FROM='title, content'
) engine='columnar'"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SHOW CREATE TABLE test_json_columnar" | grep -o "model_name='sentence-transformers/all-MiniLM-L6-v2'"
––– output –––
OK
––– input –––
curl -s -X POST http://localhost:9308/insert -d '{"index":"test_json_columnar","id":1,"doc":{"title":"machine learning","content":"neural networks"}}' | jq -r 'if ._id then ._id else "inserted" end'
––– output –––
- inserted
+ bash: line 19: jq: command not found
––– input –––
mysql -h0 -P9306 -e "SELECT id FROM test_json_columnar WHERE KNN(embedding, 1, 'machine learning neural networks')"
––– output –––
OK
––– input –––
curl -s -X POST http://localhost:9308/bulk -H "Content-Type: application/x-ndjson" -d '
{"insert":{"index":"test_json_columnar","id":2,"doc":{"title":"computer vision","content":"image recognition"}}}
{"insert":{"index":"test_json_columnar","id":3,"doc":{"title":"NLP","content":"text processing"}}}
' | jq '{created: .items[0].bulk.created}'
––– output –––
- {
+ bash: line 26: jq: command not found
- "created": 2
- }
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) FROM test_json_columnar WHERE id IN (2,3)"
––– output –––
OK
––– input –––
curl -s -X POST http://localhost:9308/replace -d '{"index":"test_json_columnar","id":1,"doc":{"title":"updated ML","content":"updated networks"}}' | jq -r '.result'
––– output –––
- updated
+ bash: line 30: jq: command not found
––– input –––
mysql -h0 -P9306 -e "SELECT title FROM test_json_columnar WHERE id=1 AND KNN(embedding, 1, 'updated ML networks')"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "INSERT INTO test_json_columnar (id, title, content) VALUES (100, 'test', 'data')";
curl -s -X POST http://localhost:9308/insert -d '{"index":"test_json_columnar","id":101,"doc":{"title":"test","content":"data"}}' > /dev/null
––– output –––
OK
––– input –––
mysql -h0 -P9306 --batch --skip-column-names -e "SELECT embedding FROM test_json_columnar WHERE id=100" > /tmp/v1.txt
mysql -h0 -P9306 --batch --skip-column-names -e "SELECT embedding FROM test_json_columnar WHERE id=101" > /tmp/v2.txt
diff -q /tmp/v1.txt /tmp/v2.txt > /dev/null && echo "Vectors identical" || echo "Vectors differ"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) FROM test_json_columnar"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "FLUSH RAMCHUNK test_json_columnar; OPTIMIZE TABLE test_json_columnar OPTION sync=1, cutoff=1"; echo $?
––– output –––
OK
––– input –––
VECTOR=$(python3 -c "print(','.join(['0.01']*384))")
curl -s -X POST http://localhost:9308/search -d "{\"index\":\"test_json_columnar\",\"knn\":{\"field\":\"embedding\",\"query_vector\":[$VECTOR],\"k\":2}}" | jq -r '.hits.total // "0"'
––– output –––
- 5
+ bash: line 46: jq: command not found
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE no_auto_embed (title TEXT, vec FLOAT_VECTOR KNN_TYPE='hnsw' KNN_DIMS='384' HNSW_SIMILARITY='l2') engine='columnar'"
––– output –––
OK
––– input –––
VECTOR=$(python3 -c "print(','.join(['0.5']*384))")
curl -s -X POST http://localhost:9308/insert -d "{\"index\":\"no_auto_embed\",\"id\":1,\"doc\":{\"title\":\"test\",\"vec\":[$VECTOR]}}" | jq -r 'if ._id then ._id else "inserted" end'
––– output –––
- inserted
+ bash: line 51: jq: command not found
––– input –––
QUERY_VEC=$(python3 -c "print(','.join(['0.5']*384))")
curl -s -X POST http://localhost:9308/search -d "{\"index\":\"no_auto_embed\",\"knn\":{\"field\":\"vec\",\"query_vector\":[$QUERY_VEC],\"k\":1}}" | jq -r '.hits.total // "0"'
––– output –––
- 1
+ bash: line 54: jq: command not found
––– input –––
mysql -h0 -P9306 -e "CREATE TABLE test_json_rowwise (
title TEXT,
content TEXT,
embedding FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2'
MODEL_NAME='sentence-transformers/all-MiniLM-L6-v2'
FROM='title, content'
) engine='rowwise'"; echo $?
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SHOW CREATE TABLE test_json_rowwise" | grep -o "model_name='sentence-transformers/all-MiniLM-L6-v2'"
––– output –––
OK
––– input –––
curl -s -X POST http://localhost:9308/insert -d '{"index":"test_json_rowwise","id":1,"doc":{"title":"machine learning","content":"neural networks"}}' | jq -r 'if ._id then ._id else "inserted" end'
––– output –––
- inserted
+ bash: line 66: jq: command not found
––– input –––
mysql -h0 -P9306 -e "SELECT id FROM test_json_rowwise WHERE KNN(embedding, 1, 'machine learning neural networks')"
––– output –––
OK
––– input –––
curl -s -X POST http://localhost:9308/bulk -H "Content-Type: application/x-ndjson" -d '
{"insert":{"index":"test_json_rowwise","id":2,"doc":{"title":"computer vision","content":"image recognition"}}}
{"insert":{"index":"test_json_rowwise","id":3,"doc":{"title":"NLP","content":"text processing"}}}
' | jq '{created: .items[0].bulk.created}'
––– output –––
- {
+ bash: line 73: jq: command not found
- "created": 2
- }
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) FROM test_json_rowwise WHERE id IN (2,3)"
––– output –––
OK
––– input –––
curl -s -X POST http://localhost:9308/replace -d '{"index":"test_json_rowwise","id":1,"doc":{"title":"updated ML","content":"updated networks"}}' | jq -r '.result'
––– output –––
- updated
+ bash: line 77: jq: command not found
––– input –––
mysql -h0 -P9306 -e "SELECT title FROM test_json_rowwise WHERE id=1 AND KNN(embedding, 1, 'updated ML networks')"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "INSERT INTO test_json_rowwise (id, title, content) VALUES (100, 'test', 'data')";
curl -s -X POST http://localhost:9308/insert -d '{"index":"test_json_rowwise","id":101,"doc":{"title":"test","content":"data"}}' > /dev/null
––– output –––
OK
––– input –––
mysql -h0 -P9306 --batch --skip-column-names -e "SELECT embedding FROM test_json_rowwise WHERE id=100" > /tmp/v1.txt
mysql -h0 -P9306 --batch --skip-column-names -e "SELECT embedding FROM test_json_rowwise WHERE id=101" > /tmp/v2.txt
diff -q /tmp/v1.txt /tmp/v2.txt > /dev/null && echo "Vectors identical" || echo "Vectors differ"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "SELECT COUNT(*) FROM test_json_rowwise"
––– output –––
OK
––– input –––
mysql -h0 -P9306 -e "FLUSH RAMCHUNK test_json_rowwise; OPTIMIZE TABLE test_json_rowwise OPTION sync=1, cutoff=1"; echo $?
––– output –––
OK
––– input –––
VECTOR=$(python3 -c "print(','.join(['0.01']*384))")
curl -s -X POST http://localhost:9308/search -d "{\"index\":\"test_json_rowwise\",\"knn\":{\"field\":\"embedding\",\"query_vector\":[$VECTOR],\"k\":2}}" | jq -r '.hits.total // "0"'
––– output –––
- 5
+ bash: line 93: jq: command not found
Related issue #155