feat(examples): add custom HTTP embedding example for LM Studio / Ollama by cluster2600 · Pull Request #149 · alibaba/zvec

cluster2600 · 2026-02-19T09:32:20Z

Summary

This PR adds a self-contained example showing how to use any OpenAI-compatible HTTP embedding endpoint (LM Studio, Ollama, vLLM, LocalAI, …) as the embedding source in zvec.

What's added

examples/custom_http_embedding.py


Component	Description
`HTTPEmbeddingFunction`	A zero-dependency class (stdlib only) that calls any `/v1/embeddings` endpoint, caches results with `@lru_cache`, and satisfies the `DenseEmbeddingFunction` protocol.
Collection setup	HNSW index with cosine similarity, dimension auto-detected from the server response.
Insert + query	5 sample documents embedded on the fly, then a semantic search query.
CLI interface	`--base-url`, `--model`, `--api-key`, `--collection-path` flags for easy customisation.
README-style header	Step-by-step instructions for LM Studio and Ollama embedded at the top of the file.
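For readers who want to see the general pattern, here is a minimal sketch of a stdlib-only client for an OpenAI-compatible `/v1/embeddings` endpoint with an LRU cache. It is not the code from this PR: the names `build_request`, `parse_embedding`, and `SketchHTTPEmbedding` are hypothetical, the default model name is a placeholder, and the default base URL simply reuses the `localhost:1234` address mentioned under Testing.

```python
from __future__ import annotations

import json
import urllib.request
from functools import lru_cache


def build_request(base_url: str, model: str, text: str, api_key: str = "") -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/embeddings endpoint."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    url = f"{base_url.rstrip('/')}/v1/embeddings"
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def parse_embedding(body: dict) -> list[float]:
    """Pull the first embedding vector out of an OpenAI-style response body."""
    try:
        return list(body["data"][0]["embedding"])
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(f"Unexpected response format: {body!r}") from exc


class SketchHTTPEmbedding:
    """Hypothetical stand-in for the class described above (not the PR's code)."""

    def __init__(self, base_url: str = "http://localhost:1234",
                 model: str = "placeholder-embedding-model",  # assumption, not a real default
                 api_key: str = "", timeout: float = 30.0) -> None:
        self.base_url = base_url
        self.model = model
        self.api_key = api_key
        self.timeout = timeout
        # Per-instance cache; tuples are stored so cached vectors cannot be mutated.
        self._cached = lru_cache(maxsize=256)(self._fetch)

    def _fetch(self, text: str) -> tuple[float, ...]:
        request = build_request(self.base_url, self.model, text, self.api_key)
        with urllib.request.urlopen(request, timeout=self.timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
        return tuple(parse_embedding(body))

    def embed(self, text: str) -> list[float]:
        """Return the embedding for `text`, hitting the server only on a cache miss."""
        return list(self._cached(text))
```

Wrapping the bound method per instance, rather than decorating the method itself, keeps each instance's cache separate and avoids holding instances alive through a class-level cache.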

Usage

# LM Studio (default)
python examples/custom_http_embedding.py

# Ollama
python examples/custom_http_embedding.py \
    --base-url http://localhost:11434 \
    --model nomic-embed-text
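The flags used above could be wired up roughly as follows. This is a sketch under assumptions: only the four flag names come from the PR description, while the function name `build_parser`, the help strings, and every default value are illustrative (the default base URL reuses the LM Studio address from the Testing section).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the four flags listed in the PR description."""
    parser = argparse.ArgumentParser(
        description="Embed documents through an OpenAI-compatible HTTP endpoint."
    )
    # Assumed default: the LM Studio address mentioned in the Testing section.
    parser.add_argument("--base-url", default="http://localhost:1234",
                        help="Root URL of the embedding server")
    # Placeholder default; the real script's default model is not shown in the PR.
    parser.add_argument("--model", default="placeholder-embedding-model",
                        help="Model name sent in the request body")
    parser.add_argument("--api-key", default="",
                        help="Optional bearer token; local servers often need none")
    # Placeholder default path for the on-disk collection.
    parser.add_argument("--collection-path", default="./http_embedding_demo",
                        help="Directory where the collection is stored")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.base_url, args.model, args.collection_path)
```

argparse turns `--base-url` into the attribute `args.base_url`, so the Ollama invocation above would yield `args.base_url == "http://localhost:11434"` and `args.model == "nomic-embed-text"`.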

Motivation

The existing extensions (OpenAIDenseEmbedding, etc.) require the openai package and are primarily designed for cloud APIs. Many developers want to use local inference servers without extra dependencies. This example shows the pattern using only Python stdlib, making it easy to adapt or inline.

Testing

The example runs end-to-end against a live LM Studio instance on localhost:1234. No new test infrastructure is required for a standalone script.

CLAassistant · 2026-02-19T09:32:28Z

All committers have signed the CLA.

Cuiyus · 2026-02-26T08:40:34Z

Thank you for your submission!

This service-based invocation model, which helps zvec support RAG, is something we currently lack.
You can implement your HttpEmbeddingFunction (or OllamaEmbeddingFunction) by inheriting from DenseEmbeddingFunction in the python/zvec/extension directory. That will make it easier for more users to use!

https://zvec.org/api-reference/python/extension/#zvec.extension.DenseEmbeddingFunction

cluster2600 · 2026-02-26T09:06:56Z

Thanks for the feedback! Moved the implementation into python/zvec/extension/http_embedding_function.py as HTTPDenseEmbedding, inheriting from DenseEmbeddingFunction. It's now exported from zvec.extension and the example imports it from there instead of defining the class inline.

Move the HTTP embedding implementation from the example script into python/zvec/extension/ as HTTPDenseEmbedding, inheriting from DenseEmbeddingFunction. The example now imports from zvec.extension instead of defining the class inline. Signed-off-by: Maxime <maxime@cluster2600.com> Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>

Cuiyus · 2026-02-27T07:25:18Z

@greptile

greptile-apps · 2026-02-27T07:29:20Z

Greptile Summary

Added HTTPDenseEmbedding, a new extension class that provides dense text embeddings using OpenAI-compatible HTTP endpoints (LM Studio, Ollama, vLLM, LocalAI). The implementation uses only Python stdlib (urllib, json) with no external dependencies.

Key changes:

Implemented HTTPDenseEmbedding class with auto-detected dimensions and LRU caching (@lru_cache(maxsize=256))
Exported the new class from zvec.extension following existing patterns
Supports configurable base URL, model, API key, and timeout parameters

Notes:

The PR description mentions adding examples/custom_http_embedding.py, but that file was removed in commit 327d718. The actual change adds a production extension, not an example.
Consider adding dimension validation (like OpenAIDenseEmbedding does) to catch server inconsistencies
The dimension detection via embed("dimension probe") is clever but pollutes the cache with one entry
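One way to avoid the cache pollution noted above is to send the probe through the uncached code path. The sketch below illustrates the idea only; `ProbeDemo` and its `fetch` callable are hypothetical stand-ins for the extension class and its HTTP call, not code from this PR.

```python
from functools import lru_cache


class ProbeDemo:
    """Hypothetical embedding wrapper that probes its dimension without caching."""

    def __init__(self, fetch):
        # `fetch` stands in for the HTTP call: text -> sequence of floats.
        self._fetch = fetch
        self._cached = lru_cache(maxsize=256)(self._fetch_tuple)
        # Call the uncached method directly, so the probe text never takes a cache slot.
        self.dimension = len(self._fetch_tuple("dimension probe"))

    def _fetch_tuple(self, text):
        # Tuples are hashable and immutable, which makes them safe cache values.
        return tuple(self._fetch(text))

    def embed(self, text):
        return list(self._cached(text))

    def cache_info(self):
        return self._cached.cache_info()


if __name__ == "__main__":
    demo = ProbeDemo(lambda text: [0.0, 1.0, 2.0])
    print(demo.dimension, demo.cache_info())
```

The trade-off is one extra request if the probe text is later embedded for real, which is unlikely to matter for a fixed probe string.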

Confidence Score: 4/5

Safe to merge with minor concerns about missing dimension validation
The implementation follows existing patterns from OpenAIDenseEmbedding and JinaDenseEmbedding. Code is well-documented, has proper error handling, and uses appropriate Python stdlib APIs. The main concern is the lack of dimension validation that could lead to silent bugs if the server behaves inconsistently. The PR description being outdated is a documentation issue, not a code issue.
No files require special attention - http_embedding_function.py could benefit from dimension validation but is otherwise solid

Important Files Changed

Filename	Overview
python/zvec/extension/__init__.py	Added import and export for new HTTPDenseEmbedding class following existing patterns
python/zvec/extension/http_embedding_function.py	New embedding class for OpenAI-compatible endpoints. Missing dimension validation unlike OpenAIDenseEmbedding. PR description outdated.

Last reviewed commit: e91e5cd

cluster2600 · 2026-02-27T09:24:31Z

Removed the example file as requested — server-dependent examples belong in zvec-web.

Cuiyus · 2026-03-02T06:45:03Z

> Removed the example file as requested — server-dependent examples belong in zvec-web.

I couldn't find any pull request related to HttpEmbeddingFunction at https://github.com/zvec-ai/zvec-web/pulls.

It would be helpful if you could add some explanation of HttpEmbeddingFunction to zvec-web, so that more users can use it.

Cuiyus · 2026-03-02T11:55:32Z

@greptile

greptile-apps

2 files reviewed, 1 comment

greptile-apps · 2026-03-02T12:01:56Z

python/zvec/extension/http_embedding_function.py

+            vector: list[float] = body["data"][0]["embedding"]
+        except (KeyError, IndexError) as exc:
+            raise ValueError(
+                f"Unexpected response format from embedding server: {body}"
+            ) from exc
+
+        return vector


Unlike OpenAIDenseEmbedding (lines 232-236), this does not validate that the length of the returned vector matches self.dimension. If the server returns vectors of inconsistent dimension across calls, this could lead to silent failures downstream.

Suggested change

-            vector: list[float] = body["data"][0]["embedding"]
-        except (KeyError, IndexError) as exc:
-            raise ValueError(
-                f"Unexpected response format from embedding server: {body}"
-            ) from exc
-
-        return vector
+        try:
+            vector: list[float] = body["data"][0]["embedding"]
+        except (KeyError, IndexError) as exc:
+            raise ValueError(
+                f"Unexpected response format from embedding server: {body}"
+            ) from exc
+
+        if self._dimension is not None and len(vector) != self._dimension:
+            raise ValueError(
+                f"Dimension mismatch: expected {self._dimension}, got {len(vector)}"
+            )
+
+        return vector
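To show how the suggested check behaves in isolation, here is a standalone sketch. The function name `extract_vector` and its `expected_dim` parameter are hypothetical; in the extension class the expected value would come from `self._dimension`, as in the suggestion above. Catching `TypeError` as well is an addition of this sketch, covering bodies where `data` is not a list.

```python
from __future__ import annotations


def extract_vector(body: dict, expected_dim: int | None = None) -> list[float]:
    """Extract the first embedding and optionally check its length."""
    try:
        vector = body["data"][0]["embedding"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(
            f"Unexpected response format from embedding server: {body}"
        ) from exc

    # Skip the check while the dimension is still unknown (e.g. during the first probe).
    if expected_dim is not None and len(vector) != expected_dim:
        raise ValueError(
            f"Dimension mismatch: expected {expected_dim}, got {len(vector)}"
        )
    return vector


if __name__ == "__main__":
    good = {"data": [{"embedding": [0.1, 0.2, 0.3]}]}
    print(extract_vector(good, expected_dim=3))
    try:
        extract_vector({"data": [{"embedding": [0.1]}]}, expected_dim=3)
    except ValueError as err:
        print(err)
```

Raising at the point of extraction means a misbehaving server is reported with the offending lengths, rather than surfacing later as an insert or query error.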

feihongxu0824 assigned Cuiyus Feb 19, 2026

Cuiyus self-requested a review February 26, 2026 08:22

cluster2600 added 2 commits February 26, 2026 10:11

feat(examples): add custom HTTP embedding example for LM Studio / Ollama

ca75c18

cluster2600 force-pushed the feat/lmstudio-custom-http-embedding branch from 31f67ed to 9a81b28 February 26, 2026 09:11

cluster2600 added 3 commits February 26, 2026 10:14

fix(examples): resolve ruff lint errors in HTTP embedding example

e5dee48

Move zvec imports to top-level, add noqa for print statements, replace os.path.exists with pathlib, fix import sorting. Signed-off-by: Maxime <maxime@cluster2600.com> Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>

style: apply ruff formatter

400dacf

Signed-off-by: Maxime <maxime@cluster2600.com> Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>

ci: retrigger CI (flaky macOS C++ test)

eb3960e

The vector_column_indexer_test failure is a known flaky assertion in hnsw_streamer_entity.h, unrelated to Python-only changes in this PR. Signed-off-by: Maxime <maxime@cluster2600.com> Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>

Cuiyus reviewed Feb 27, 2026

examples/custom_http_embedding.py (outdated, resolved)

chore: remove custom HTTP embedding example

327d718

Per maintainer feedback, examples requiring an external LLM server belong in the zvec-web project rather than in this repository. Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>

Merge branch 'main' into feat/lmstudio-custom-http-embedding

697a503

Cuiyus approved these changes Mar 2, 2026

Cuiyus closed this Mar 2, 2026

Cuiyus reopened this Mar 2, 2026

Merge branch 'main' into feat/lmstudio-custom-http-embedding

e91e5cd

greptile-apps bot reviewed Mar 2, 2026

cluster2600 wants to merge 8 commits into alibaba:main from cluster2600:feat/lmstudio-custom-http-embedding
