feat(examples): add custom HTTP embedding example for LM Studio / Ollama #149

cluster2600 wants to merge 8 commits into alibaba:main
Conversation
Thank you for your submission! This service-oriented invocation model, which helps zvec achieve RAG capability, is something we currently lack. https://zvec.org/api-reference/python/extension/#zvec.extension.DenseEmbeddingFunction
Thanks for the feedback! Moved the implementation into python/zvec/extension/ as HTTPDenseEmbedding.
Move the HTTP embedding implementation from the example script into python/zvec/extension/ as HTTPDenseEmbedding, inheriting from DenseEmbeddingFunction. The example now imports from zvec.extension instead of defining the class inline. Signed-off-by: Maxime <maxime@cluster2600.com> Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
Force-pushed from 31f67ed to 9a81b28.
Move zvec imports to top-level, add noqa for print statements, replace os.path.exists with pathlib, fix import sorting. Signed-off-by: Maxime <maxime@cluster2600.com> Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
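The pathlib change mentioned in this commit presumably amounts to something like the following; the helper name is hypothetical:

```python
from pathlib import Path


def collection_exists(collection_path: str) -> bool:
    """pathlib replacement for os.path.exists(collection_path)."""
    return Path(collection_path).exists()
```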
The vector_column_indexer_test failure is a known flaky assertion in hnsw_streamer_entity.h, unrelated to Python-only changes in this PR. Signed-off-by: Maxime <maxime@cluster2600.com> Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
@greptile
Greptile Summary

Key changes:
Notes:
Confidence Score: 4/5
Important Files Changed
Last reviewed commit: e91e5cd
Per maintainer feedback, examples requiring an external LLM server belong in the zvec-web project rather than in this repository. Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
Removed the example file as requested; server-dependent examples belong in zvec-web.
I didn't find any pull requests related to HTTPEmbeddingFunction at https://github.com/zvec-ai/zvec-web/pulls. It would be helpful if you could add some explanation of HTTPEmbeddingFunction to the zvec-web project, so that more users can use it.
@greptile
```python
        vector: list[float] = body["data"][0]["embedding"]
    except (KeyError, IndexError) as exc:
        raise ValueError(
            f"Unexpected response format from embedding server: {body}"
        ) from exc
    return vector
```
Unlike OpenAIDenseEmbedding (line 232-236), this doesn't validate that the returned vector dimension matches self.dimension. If the server returns inconsistent dimensions across calls, this could lead to silent failures downstream.
Suggested change:

```diff
 try:
     vector: list[float] = body["data"][0]["embedding"]
 except (KeyError, IndexError) as exc:
     raise ValueError(
         f"Unexpected response format from embedding server: {body}"
     ) from exc
+if self._dimension is not None and len(vector) != self._dimension:
+    raise ValueError(
+        f"Dimension mismatch: expected {self._dimension}, got {len(vector)}"
+    )
 return vector
```
Summary
This PR adds a self-contained example showing how to use any OpenAI-compatible HTTP embedding endpoint (LM Studio, Ollama, vLLM, LocalAI, …) as the embedding source in zvec.
What's added
examples/custom_http_embedding.py defines HTTPEmbeddingFunction, which calls the /v1/embeddings endpoint, caches results with @lru_cache, and satisfies the DenseEmbeddingFunction protocol. It supports --base-url, --model, --api-key, --collection-path flags for easy customisation.

Usage
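The @lru_cache behaviour described above can be sketched with a fake transport (everything below is illustrative, not the example's actual code): repeated inputs hit the cache instead of the server.

```python
from functools import lru_cache

calls = {"count": 0}


def fake_embed_request(text: str) -> list[float]:
    """Stand-in for the HTTP round trip; counts how often it runs."""
    calls["count"] += 1
    return [float(len(text)), 0.0]


@lru_cache(maxsize=1024)
def embed(text: str) -> tuple[float, ...]:
    # lru_cache keys on the argument, so identical texts skip the request;
    # a tuple is returned because cached values should be immutable.
    return tuple(fake_embed_request(text))


embed("hello")
embed("hello")  # cache hit: the fake transport runs only once
assert calls["count"] == 1
```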
Motivation
The existing extensions (OpenAIDenseEmbedding, etc.) require the openai package and are primarily designed for cloud APIs. Many developers want to use local inference servers without extra dependencies. This example shows the pattern using only the Python stdlib, making it easy to adapt or inline.

Testing
The example runs end-to-end against a live LM Studio instance on localhost:1234. No new test infrastructure is required for a standalone script.