Knowledge base feature by bulletinmybeard · Pull Request #7 · bulletinmybeard/agent-forge

bulletinmybeard · 2026-06-21T05:18:28Z

Summary

Adds a personal Knowledge Database to agentforge-api (/knowledge/*): a store
for user-created entries — snippets, commands, URLs, configs, error solutions,
notes, API examples — in its own Qdrant collection (knowledge_entries),
separate from the RAG index. Same embedding pipeline, dedicated CRUD + search.

What's included

API (/knowledge/*)

CRUD: create (single + batch up to 100), get, update, delete, bulk-delete by filter
Search: semantic /search and /search/smart with tag/type/project filters, plus tag faceting and stats
/filter: list entries by metadata filters (incl. parent_id) without a vector search
/entries/{id}/context: most relevant passages from one entry for a query, with adjacent pages for context
/entries/{id}/rechunk: rebuild page chunks for entries indexed before chunking existed
/extract: server-side text extraction from uploads (PDF via pdfplumber, pdftotext fallback; text/code/config as UTF-8), reusing AgentForge's extraction path instead of frontend JS
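The "adjacent pages for context" idea behind /entries/{id}/context can be sketched roughly as follows. This is an illustration only, not the code in this PR: `PageHit`, `pages_with_context`, `top_k`, and `window` are hypothetical names, and the scores stand in for whatever the page-chunk vector search returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageHit:
    """Hypothetical result of a page-chunk search within a single entry."""
    page: int      # 1-based page number of the chunk
    score: float   # similarity score, higher is better


def pages_with_context(hits: list[PageHit], total_pages: int,
                       top_k: int = 2, window: int = 1) -> list[int]:
    """Return the top_k best-scoring pages plus `window` neighbours on each side."""
    best = sorted(hits, key=lambda h: h.score, reverse=True)[:top_k]
    selected: set[int] = set()
    for hit in best:
        # Clamp the neighbourhood to the valid page range of the entry.
        first = max(1, hit.page - window)
        last = min(total_pages, hit.page + window)
        selected.update(range(first, last + 1))
    return sorted(selected)


hits = [
    PageHit(page=2, score=0.41),
    PageHit(page=7, score=0.93),
    PageHit(page=10, score=0.77),
]
print(pages_with_context(hits, total_pages=10))  # [6, 7, 8, 9, 10]
```

Returning page numbers in document order, rather than score order, keeps neighbouring passages readable as one continuous excerpt.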

Behavior

Smart re-indexing on update — re-embeds only when title, content, or notes change; metadata-only edits skip the embedding call
Parent/child attachments via parent_id, with per-page chunking so a parent and its attached documents are searchable as passages
Free-form metadata field on all points, accepted in requests and returned in responses
SAQ batch job for bulk ingestion
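The smart re-indexing rule above boils down to a small check on the update payload. A minimal sketch, assuming entries and patches are plain dicts; `EMBEDDED_FIELDS` and `needs_reembedding` are hypothetical names, not identifiers from this PR.

```python
# Fields that feed the embedding, per the PR description.
EMBEDDED_FIELDS = frozenset({"title", "content", "notes"})


def needs_reembedding(existing: dict, patch: dict) -> bool:
    """True if the patch changes any field that feeds the embedding."""
    return any(
        field in patch and patch[field] != existing.get(field)
        for field in EMBEDDED_FIELDS
    )


entry = {
    "title": "Restart nginx",
    "content": "sudo systemctl restart nginx",
    "notes": "",
    "tags": ["ops"],
}

print(needs_reembedding(entry, {"tags": ["ops", "linux"]}))           # False: metadata only
print(needs_reembedding(entry, {"title": "Restart nginx"}))           # False: value unchanged
print(needs_reembedding(entry, {"content": "sudo nginx -s reload"}))  # True: content changed
```

Comparing values instead of only checking which keys are present means a client that resends an unchanged title does not trigger an embedding call.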

Config

New knowledge block: collection_name, dedup_threshold, composite_template (env prefix KNOWLEDGE_)
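To make the three settings concrete, here is a hedged sketch of how such a block could be modelled and overridden from the environment. Only the three key names, the `knowledge_entries` collection name, and the `KNOWLEDGE_` prefix come from this PR. The default threshold, the default template, the exact environment variable names, and the `compose` helper are assumptions for illustration, as is the reading of `composite_template` as the text that gets embedded.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeConfig:
    collection_name: str = "knowledge_entries"
    dedup_threshold: float = 0.95  # assumed default, not taken from the PR
    composite_template: str = "{title}\n\n{content}\n\n{notes}"  # assumed default

    @classmethod
    def from_env(cls, env=None) -> "KnowledgeConfig":
        """Build a config, letting KNOWLEDGE_* variables override the defaults."""
        env = os.environ if env is None else env
        defaults = cls()
        return cls(
            collection_name=env.get("KNOWLEDGE_COLLECTION_NAME", defaults.collection_name),
            dedup_threshold=float(env.get("KNOWLEDGE_DEDUP_THRESHOLD", defaults.dedup_threshold)),
            composite_template=env.get("KNOWLEDGE_COMPOSITE_TEMPLATE", defaults.composite_template),
        )

    def compose(self, title: str, content: str, notes: str = "") -> str:
        """Render the text that would be embedded for one entry (assumed usage)."""
        return self.composite_template.format(
            title=title, content=content, notes=notes
        ).strip()


cfg = KnowledgeConfig.from_env({"KNOWLEDGE_DEDUP_THRESHOLD": "0.9"})
print(cfg.dedup_threshold)                      # 0.9
print(repr(cfg.compose("Title", "Body text")))  # 'Title\n\nBody text'
```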

Implementation notes

knowledge_service.py orchestrates embed -> dedup -> upsert; knowledge_vector_service.py owns the Qdrant collection (lazy client, payload indexes, page-chunk search).
Made embedding_service a lazy proxy so importing the service stack no longer builds the embedding client at import time — keeps test collection working without a config.yaml (gitignored, absent in CI). Construction defers to the first real .embed() call.
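The lazy-proxy pattern mentioned above can be sketched like this. It is a generic illustration, not the implementation in this PR: `LazyProxy` and `FakeEmbeddingClient` are hypothetical, and the fake client stands in for the real embedding client that needs config.yaml.

```python
class LazyProxy:
    """Defer building the real object until an attribute is first accessed."""

    def __init__(self, factory):
        self._factory = factory
        self._target = None

    def __getattr__(self, name):
        # __getattr__ only runs for attributes not found the normal way,
        # so _factory and _target (set in __init__) never end up here.
        if self._target is None:
            self._target = self._factory()
        return getattr(self._target, name)


built = []  # records each construction so the deferral is visible


class FakeEmbeddingClient:
    def __init__(self):
        built.append("client")  # stands in for reading config / connecting

    def embed(self, text):
        return [float(len(text))]


# Creating the proxy (e.g. at import time) does not construct the client.
embedding_service = LazyProxy(FakeEmbeddingClient)
print(built)                             # []
print(embedding_service.embed("hello"))  # [5.0]
print(built)                             # ['client']
```

Because nothing is constructed at import time, test discovery can import the module without any configuration present; the cost is that configuration errors surface on first use instead of at startup.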

Testing

54 new test cases across 5 files (models, service, vector service, routes, batch job); all mock Qdrant/Ollama.
Full suite: 122 passed. ruff check + ruff format --check clean.

Docs

Updated README, docs/api.md, docs/architecture.md, docs/README.md, config.example.yaml.
CHANGELOG under 0.8.0; version bumped 0.7.0 -> 0.8.0.

… a vector search, to retrieve the most relevant passages from an entry (prompt query), to re-chunk kb attachments to parent and its own entry
- Improve page marker chunking for kb entries
- Improve overall kb chunking

bulletinmybeard added 8 commits June 20, 2026 12:38

Integrate from dev-agent-forge the personal Knowledge Base API feature

64869b2

Add a metadata field to all Qdrant kb collection points (request+response)

f2858c1

Add a content extraction endpoint to the knowledge base endpoint group. We already have proper content extraction tools implemented in AgentForge and reuse them instead of trying to extract large PDFs and other documents via a frontend JS package

293d916

Add the generated and vibe-coded tests from yesterday

a6c609a

Fix the broken kb tests

90d4e0b

- Refactor the tests to make them succeed

911223d

- Bump up the app version (new release)
- Update the CHANGELOG and documentation
- Add the new knowledge base section to the `config.example.yaml`

fix tests

e1a6828

bulletinmybeard merged commit a9fc662 into master Jun 21, 2026
4 checks passed

bulletinmybeard merged 8 commits into master from knowledge-base-feature
