feat(integrations): add LlamaIndex VectorStore adapter #7

Open
Davidobot wants to merge 1 commit into main from feat/llama-index-vectorstore
@Davidobot

Contributor

Summary

Adds a LlamaIndex VectorStore adapter for LodeDB — LodeDBVectorStore, a llama_index.core.vector_stores.types.BasePydanticVectorStore backed by the local LodeDB SDK — alongside the existing LangChain adapter. Installable via the new lodedb[llama-index] extra.

This unlocks the LlamaIndex ecosystem directly and is the prerequisite for app-level integrations built on LlamaIndex's vector-store abstraction (e.g. PrivateGPT).

Design

The adapter is text-path. The LodeDB SDK is text-in and embeds internally — add()/search() take strings and there is no precomputed-vector entry point — so the adapter sets is_embedding_query=False and feeds LlamaIndex node text (node.get_content) and the raw query.query_str to LodeDB, which does the embedding. LlamaIndex's own embed_model is not used for indexing or querying. (Honoring it would require a vector-in LodeDB API that does not exist yet.)
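
A minimal sketch of the text-path idea, not the adapter itself. `TextInStore`, `Query`, and `TextPathAdapter` are hypothetical stand-ins so the snippet runs without `lodedb` or `llama-index` installed, and the token-overlap "embedding" is a toy. The only points carried over from the design are that the store receives strings, that `is_embedding_query` is `False`, and that any precomputed query vector is ignored.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TextInStore:
    """Hypothetical stand-in for a text-in store that embeds internally."""

    docs: dict = field(default_factory=dict)

    @staticmethod
    def _embed(text: str) -> set:
        # Toy "embedding": the set of lowercase tokens (assumption, for illustration).
        return set(text.lower().split())

    def add(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = text

    def search(self, query: str, k: int) -> list:
        # Score every document by Jaccard overlap with the query tokens.
        q = self._embed(query)
        scored = []
        for doc_id, text in self.docs.items():
            d = self._embed(text)
            union = q | d
            scored.append((doc_id, len(q & d) / len(union) if union else 0.0))
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


@dataclass
class Query:
    """Stand-in for the query fields that matter on the text path."""

    query_str: str
    query_embedding: Optional[list] = None
    similarity_top_k: int = 2


class TextPathAdapter:
    # Signals that the caller does not need to compute a query vector.
    is_embedding_query = False

    def __init__(self, store: TextInStore) -> None:
        self.store = store

    def add(self, nodes: dict) -> list:
        # nodes maps node id -> node text; the store does the embedding.
        for node_id, text in nodes.items():
            self.store.add(node_id, text)
        return list(nodes)

    def query(self, query: Query) -> list:
        # query.query_embedding is deliberately ignored: only text is forwarded.
        hits = self.store.search(query.query_str, query.similarity_top_k)
        return [doc_id for doc_id, _ in hits]
```

In this sketch, passing a `query_embedding` has no effect on the result, which mirrors why a configured `embed_model` would go unused.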

Supported:

  • add / query / delete / delete_nodes, client, persist, async shims
  • node_ids and doc_ids query scoping (doc-id resolution is session-local via a ref-doc map)
  • exact-match metadata filter translation (EQ + AND)
  • exact top-k (VectorStoreQueryMode.DEFAULT)
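
One way the session-local doc-id resolution could work, as a sketch only. `RefDocMap` is a hypothetical helper, and intersecting `node_ids` with the nodes resolved from `doc_ids` is an assumption about how the two scopes combine. Because the map lives in memory, documents added in an earlier process cannot be resolved.

```python
from collections import defaultdict
from typing import Iterable, Optional


class RefDocMap:
    """Hypothetical in-memory map from ref_doc_id to the node ids added under it."""

    def __init__(self) -> None:
        self._nodes_by_doc = defaultdict(set)

    def record(self, node_id: str, ref_doc_id: Optional[str]) -> None:
        # Called on add(); nodes without a source document are simply not tracked.
        if ref_doc_id is not None:
            self._nodes_by_doc[ref_doc_id].add(node_id)

    def forget_doc(self, ref_doc_id: str) -> set:
        # Called on delete(); returns the node ids that belonged to the document.
        return self._nodes_by_doc.pop(ref_doc_id, set())

    def resolve(
        self,
        node_ids: Optional[Iterable[str]] = None,
        doc_ids: Optional[Iterable[str]] = None,
    ) -> Optional[set]:
        """Return the allowed node ids, or None when the query is unscoped."""
        allowed = None
        if node_ids is not None:
            allowed = set(node_ids)
        if doc_ids is not None:
            from_docs = set()
            for doc_id in doc_ids:
                from_docs |= self._nodes_by_doc.get(doc_id, set())
            # Assumption: both scopes together mean "nodes satisfying both".
            allowed = from_docs if allowed is None else allowed & from_docs
        return allowed
```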

Raises clearly where LodeDB cannot honor the contract (all documented in the module docstring): non-EQ / OR / nested filters, non-DEFAULT query modes, get() (no full-precision vector read), and get_nodes / filter-based deletion (no metadata enumeration).
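
The EQ + AND translation and the "raise on anything else" behavior can be sketched as follows. The classes below are local stand-ins that mirror the names in `llama_index.core.vector_stores.types` so the snippet runs without the optional dependency. The flat `{key: value}` output, the `NotImplementedError` type, and the rejection of repeated keys are assumptions, not the adapter's confirmed behavior.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class FilterOperator(str, Enum):
    EQ = "=="
    GT = ">"


class FilterCondition(str, Enum):
    AND = "and"
    OR = "or"


@dataclass
class MetadataFilter:
    key: str
    value: Any
    operator: FilterOperator = FilterOperator.EQ


@dataclass
class MetadataFilters:
    filters: list  # items are MetadataFilter or nested MetadataFilters
    condition: FilterCondition = FilterCondition.AND


def to_exact_match(filters: Optional[MetadataFilters]) -> dict:
    """Translate an AND of EQ filters into a flat exact-match dict, or raise."""
    if filters is None:
        return {}
    if filters.condition is not FilterCondition.AND:
        raise NotImplementedError(f"unsupported condition: {filters.condition.value}")
    out = {}
    for f in filters.filters:
        if isinstance(f, MetadataFilters):
            raise NotImplementedError("nested filters are not supported")
        if f.operator is not FilterOperator.EQ:
            raise NotImplementedError(f"unsupported operator: {f.operator.value}")
        if f.key in out:
            # A flat dict cannot express two constraints on the same key.
            raise NotImplementedError(f"repeated filter key: {f.key}")
        out[f.key] = f.value
    return out
```

Failing loudly keeps a query from silently returning results that ignore part of the filter.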

Changes

  • src/lodedb/local/integrations/llama_index.py: the adapter
  • examples/llama_index_store.py: runnable example
  • tests/test_local_integrations.py: roundtrip + unsupported-ops tests (gated on the optional dep)
  • pyproject.toml: llama-index extra
  • README / CHANGELOG / examples README / integrations package docstring
  • .gitignore: broaden /data/ to /data*/ so example data dirs (data_langchain/, data_llama_index/, …) stop leaking into git status

Testing

  • uv run pytest tests/test_local_integrations.py tests/test_import_boundary.py → 5 passed
  • uv run ruff check / ruff format --check → clean
  • uv run python examples/llama_index_store.py runs end-to-end on the real MiniLM model (semantically correct ranking)
  • Verified import lodedb does not import llama_index — the optional-dependency boundary is intact
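
For reviewers who want to reproduce the boundary check by hand, a rough sketch follows. `leaks_import` is a hypothetical helper, not the contents of tests/test_import_boundary.py. It imports a package in a fresh interpreter, so modules already loaded in the current process cannot mask a leak.

```python
import subprocess
import sys


def leaks_import(package: str, forbidden: str) -> bool:
    """Return True if importing `package` in a fresh interpreter loads `forbidden`."""
    code = (
        "import importlib, sys\n"
        f"importlib.import_module({package!r})\n"
        # Match the forbidden module itself or any of its submodules.
        f"leaked = any(m == {forbidden!r} or m.startswith({forbidden!r} + '.')"
        " for m in list(sys.modules))\n"
        "print('LEAK' if leaked else 'CLEAN')\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip().splitlines()[-1] == "LEAK"
```

With the package installed, the check described above would be `leaks_import("lodedb", "llama_index")`, expected to return `False`.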

Add `LodeDBVectorStore`, a LlamaIndex `BasePydanticVectorStore` backed by the
local LodeDB SDK, alongside the existing LangChain adapter. Installable via the
new `lodedb[llama-index]` extra.

The adapter is text-path: the LodeDB SDK is text-in and embeds internally (there
is no precomputed-vector entry point), so it sets `is_embedding_query=False` and
feeds node text and the raw query string to LodeDB. LlamaIndex's own embed_model
is therefore not used for indexing or querying.

- add / query / delete / delete_nodes, plus node_ids and doc_ids scoping
- exact-match metadata filter translation (EQ + AND)
- exact top-k only (VectorStoreQueryMode.DEFAULT)
- unsupported operators/conditions, non-DEFAULT modes, get(), get_nodes(), and
  filter-based deletion raise clearly (LodeDB exposes no vector read or metadata
  enumeration)

Also adds a runnable example and roundtrip + unsupported-ops tests (gated on the
optional dep), updates README/CHANGELOG/docs, and broadens the gitignore data-dir
pattern to /data*/ so example stores stop leaking into git status.