Add Image File Support as a Knowledge Source #45

@PeanutSplash

Description

Feature Request: Image File Support in Knowledge Sources

Problem

Currently, the knowledge agent only supports text-based sources (GitHub repos,
YouTube transcripts, custom APIs). Many real-world knowledge bases contain
valuable information embedded in images, such as architecture diagrams, charts,
screenshots, and scanned documents, which the agent cannot process.

Proposed Solution

Add support for image files (.png, .jpg, .jpeg, .webp, .gif) as a
knowledge source type, enabling the agent to:

  1. Ingest images from configured sources (GitHub repo folders, URLs, uploads)
  2. Extract text/content via OCR or a multimodal LLM
  3. Store extracted content in the sandbox snapshot so existing grep/cat
    tools can search it like any other text file
  4. Preserve image metadata (filename, alt text, captions) alongside content
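
A minimal sketch of how steps 1, 3, and 4 could fit together, assuming the
snapshot is a plain directory tree. All names here (is_image, write_sidecar,
ingest_images, the ".txt" sidecar convention) are hypothetical and not part of
the existing codebase. The extract_text callable stands in for whichever OCR
engine or multimodal LLM is eventually chosen in step 2.

```python
from pathlib import Path
from typing import Callable, List, Optional

# Extensions listed in the proposal above.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".gif"}


def is_image(path: Path) -> bool:
    """Return True if the file has one of the proposed image extensions."""
    return path.suffix.lower() in IMAGE_EXTENSIONS


def write_sidecar(
    image_path: Path,
    extracted_text: str,
    alt_text: Optional[str] = None,
    caption: Optional[str] = None,
) -> Path:
    """Write extracted content plus metadata to a text file beside the image.

    Hypothetical convention: diagram.png -> diagram.png.txt, so grep/cat
    can search the extracted content like any other text file.
    """
    sidecar = image_path.with_name(image_path.name + ".txt")
    lines = [f"source_image: {image_path.name}"]
    if alt_text:
        lines.append(f"alt_text: {alt_text}")
    if caption:
        lines.append(f"caption: {caption}")
    lines.append("")  # blank line between metadata and extracted content
    lines.append(extracted_text.strip())
    sidecar.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return sidecar


def ingest_images(root: Path, extract_text: Callable[[Path], str]) -> List[Path]:
    """Walk root, extract text from each image, and return the sidecar paths.

    extract_text is a placeholder for the OCR or multimodal LLM call.
    """
    sidecars = []
    # sorted() materializes the listing before any sidecar files are written.
    for path in sorted(root.rglob("*")):
        if path.is_file() and is_image(path):
            sidecars.append(write_sidecar(path, extract_text(path)))
    return sidecars
```

With this layout, a search such as grep -r "load balancer" over the snapshot
would also match text extracted from images, and the source_image line points
back to the original file.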

Use Cases

  • Searching architecture diagrams or flowcharts stored as images
  • Indexing scanned PDFs or documentation screenshots
  • Querying infographics and charts containing embedded text
  • Supporting docs where visuals are the primary communication medium

Metadata

Labels: enhancement (New feature or request)
