Description
Feature Request: Image File Support in Knowledge Sources
Problem
Currently, the knowledge agent only supports text-based sources (GitHub repos,
YouTube transcripts, custom APIs). Many real-world knowledge bases contain
valuable information embedded in images, such as architecture diagrams, charts,
screenshots, and scanned documents, which the agent cannot process.
Proposed Solution
Add support for image files (.png, .jpg, .jpeg, .webp, .gif) as a
knowledge source type, enabling the agent to:
- Ingest images from configured sources (GitHub repo folders, URLs, uploads)
- Extract text/content via OCR or a multimodal LLM
- Store extracted content in the sandbox snapshot so existing `grep`/`cat`
  tools can search it like any other text file
- Preserve image metadata (filename, alt text, captions) alongside content
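
The steps above could be sketched roughly as follows. This is only an illustration, not the agent's actual code: `ingest_images`, the `extract_text` callback (standing in for whichever OCR engine or multimodal LLM is chosen), the `alt_text` mapping, and the `<image name>.txt` sidecar layout are all hypothetical names and choices made for the example.

```python
from pathlib import Path
from typing import Callable, Dict, List, Optional

# Extensions listed in the proposal.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".gif"}


def ingest_images(
    source_dir: Path,
    snapshot_dir: Path,
    extract_text: Callable[[Path], str],
    alt_text: Optional[Dict[str, str]] = None,
) -> List[Path]:
    """Write one searchable text sidecar per image found under source_dir.

    extract_text is a placeholder for the real extractor (OCR or a
    multimodal LLM); alt_text maps relative image paths to known alt text.
    Both are assumptions made for this sketch.
    """
    alt_text = alt_text or {}
    written: List[Path] = []
    for image in sorted(source_dir.rglob("*")):
        if not image.is_file() or image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        rel = image.relative_to(source_dir)
        # Mirror the source layout so the sidecar sits where the image would.
        sidecar = snapshot_dir / rel.parent / (rel.name + ".txt")
        sidecar.parent.mkdir(parents=True, exist_ok=True)
        # Metadata goes first so grep hits show which image they came from.
        lines = [f"source_image: {rel.as_posix()}"]
        alt = alt_text.get(rel.as_posix())
        if alt:
            lines.append(f"alt_text: {alt}")
        lines += ["", extract_text(image)]
        sidecar.write_text("\n".join(lines) + "\n", encoding="utf-8")
        written.append(sidecar)
    return written
```

Because the output is plain text inside the snapshot, the existing `grep`/`cat` tools would need no changes; swapping OCR for a multimodal model only means passing a different `extract_text` callable.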
Use Cases
- Searching architecture diagrams or flowcharts stored as images
- Indexing scanned PDFs or documentation screenshots
- Querying infographics and charts containing embedded text
- Supporting docs where visuals are the primary communication medium
Additional context
No response