
Feature Request: Multimodal Memory (Image + Text) #290

@tomcatzh


Use Cases

  1. Inventory / asset management — Take a photo of an item and store it as a memory with a description. Later, retrieve it by text query ("where's the blue storage box?") or by image similarity.

  2. People & faces — Store photos associated with people/entities. The agent can "remember what someone looks like."

  3. Visual notes — Screenshot a UI, diagram, or whiteboard; store it as a searchable memory alongside text annotations.

Why this fits memory-lancedb-pro

  • LanceDB natively supports multimodal data and CLIP-style embedding functions
  • The plugin already supports "any OpenAI-compatible embedding provider", and some CLIP-style models (e.g. jina-clip-v2) are served behind a /v1/embeddings endpoint that accepts both text and image inputs
  • Text and image vectors live in the same space with CLIP, so cross-modal retrieval (text query → image result) works out of the box at the vector level
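
To make the shared-space point concrete, here is a minimal sketch of text-to-image ranking by cosine similarity. The 3-d vectors are made-up stand-ins for real CLIP embeddings, and `rank_images` is a hypothetical helper, not part of the plugin.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy vectors standing in for image embeddings (hypothetical values).
image_memories = {
    "blue_box.jpg": [0.9, 0.1, 0.0],
    "whiteboard.png": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of the text "blue storage box".
text_query_vec = [0.8, 0.2, 0.1]

def rank_images(query_vec, memories):
    """Return image keys ordered from most to least similar to the query."""
    return sorted(memories, key=lambda k: cosine(query_vec, memories[k]), reverse=True)

ranked = rank_images(text_query_vec, image_memories)
```

Because both modalities are compared with the same metric, nothing in the ranking step needs to know whether a vector came from text or from an image.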

Possible Approach (incremental)

Phase 1: Image attachment

  • Allow memory_store to accept an optional image (URL or base64)
  • Store image reference/data alongside the text in the memories table
  • Embed using a CLIP-compatible model; fall back to text-only embedding if the configured model doesn't support images
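
A rough sketch of the Phase 1 fallback, under stated assumptions: `Embedder` and `build_record` are hypothetical names, not the plugin's actual interface, and the lambdas below are stubs in place of real model calls. It also assumes that when both text and image are present the image is what gets embedded; averaging the two vectors would be another reasonable choice.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Vector = List[float]

@dataclass
class Embedder:
    """Hypothetical wrapper around an embedding provider."""
    embed_text: Callable[[str], Vector]
    # None means the configured model is text-only.
    embed_image: Optional[Callable[[str], Vector]] = None

def build_record(text: str, embedder: Embedder, image: Optional[str] = None) -> dict:
    """Build a memory row, embedding the image only when the model supports it."""
    if image is not None and embedder.embed_image is not None:
        vector, modality = embedder.embed_image(image), "image"
    else:
        # Fallback: keep the image reference but embed the text instead.
        vector, modality = embedder.embed_text(text), "text"
    return {"text": text, "image": image, "vector": vector, "embedded_as": modality}

# Stub embedders so the sketch runs without a model.
text_only = Embedder(embed_text=lambda t: [float(len(t)), 0.0])
clip_like = Embedder(
    embed_text=lambda t: [float(len(t)), 0.0],
    embed_image=lambda i: [0.0, float(len(i))],
)

url = "https://example.com/box.jpg"
row_text_only = build_record("blue storage box", text_only, image=url)
row_clip = build_record("blue storage box", clip_like, image=url)
```

With the text-only embedder the image reference is still stored, so the memory could be re-embedded later if a multimodal model is configured.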

Phase 2: Image-aware retrieval

  • memory_recall returns image references when relevant
  • Support image-as-query (pass an image to find similar memories)
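
One possible shape for Phase 2, as a sketch only: `recall` is a hypothetical function, the rows and stub embedders are invented for illustration, and vectors are assumed to be unit-normalized so a plain dot product can serve as the similarity score.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]

def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))

def recall(
    rows: List[Dict],
    embed_text: Callable[[str], Vector],
    embed_image: Callable[[str], Vector],
    *,
    text: Optional[str] = None,
    image: Optional[str] = None,
    limit: int = 3,
) -> List[Dict]:
    """Rank stored memories against either a text query or an image query."""
    if (text is None) == (image is None):
        raise ValueError("pass exactly one of text or image")
    query = embed_text(text) if text is not None else embed_image(image)
    scored = sorted(rows, key=lambda r: dot(query, r["vector"]), reverse=True)
    # Return the image reference with each hit; it is None for text-only memories.
    return [{"text": r["text"], "image": r.get("image")} for r in scored[:limit]]

# Invented rows with toy 2-d unit vectors.
rows = [
    {"text": "blue storage box, garage shelf", "image": "box.jpg", "vector": [1.0, 0.0]},
    {"text": "sprint planning whiteboard", "image": "board.png", "vector": [0.0, 1.0]},
    {"text": "prefers dark mode", "vector": [0.6, 0.8]},
]

# Stub embedders in place of a real CLIP-compatible model.
stub_text = lambda t: [1.0, 0.0] if "box" in t else [0.0, 1.0]
stub_image = lambda i: [0.0, 1.0]

by_text = recall(rows, stub_text, stub_image, text="where's the blue storage box?", limit=1)
by_image = recall(rows, stub_text, stub_image, image="photo.png", limit=2)
```

Text and image queries go through the same ranking path; only the embedding call differs.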

Notes

  • This doesn't need to replace the current text-only flow — it's additive. Users who don't configure a multimodal embedding model would see zero behavior change.
  • The schema change is modest: an optional image column (storing a URL/path/base64) in the memories table.
  • The main open question is whether this aligns with the plugin's scope, or if multimodal memory is better handled as a separate plugin/module.
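
For the schema note above, a small sketch of why an optional column can stay backward compatible. `MemoryRow`, its `id` and `text` fields, and `from_stored` are illustrative names; the plugin's real table layout may differ.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MemoryRow:
    id: str
    text: str
    # URL, file path, or base64 payload; None for text-only memories.
    image: Optional[str] = None

def from_stored(row: dict) -> MemoryRow:
    """Load a stored row; rows written before the image column existed get image=None."""
    return MemoryRow(id=row["id"], text=row["text"], image=row.get("image"))

legacy = from_stored({"id": "m1", "text": "prefers dark mode"})
with_image = from_stored({"id": "m2", "text": "blue storage box", "image": "box.jpg"})
```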

Would love to hear thoughts on whether this direction is interesting for the project!
