feat(rag): add IngestedDocument model and persist uploaded PDF metadata to database #898
Open
Pcmhacker-piro wants to merge 1 commit into
…ta to database Adds an IngestedDocument ORM model with Alembic migration for tracking uploaded regulatory PDFs. The ingest endpoint now persists filename, SHA-256 hash, file size, and chunk count for each uploaded document. The vector store merging logic (merge_into_vector_store) preserves existing FAISS index entries when ingesting new documents.
Author
The checks have passed. Could you please review and approve the pending workflows when you have a chance? Thank you!
Summary
Closes #578
Adds an IngestedDocument database model and Alembic migration to persist metadata about uploaded regulatory PDFs in the RAG Intelligence module. The POST /api/v1/rag/ingest endpoint now records filename, SHA-256 hash, file size, and chunk count for each uploaded document. The vector store is updated to merge new documents into the existing FAISS index (via merge_into_vector_store) instead of rebuilding from scratch, preserving previously ingested content.
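To illustrate the metadata described above, here is a minimal standard-library sketch of computing and storing the four recorded fields. It is not the PR's implementation: the PR uses an ORM model with an Alembic migration, whereas this sketch swaps in `sqlite3` and a dataclass so it runs without extra dependencies. The names `DocumentMetadata`, `build_metadata`, `persist`, the table name `ingested_documents`, and the column names are all hypothetical.

```python
import hashlib
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentMetadata:
    """Hypothetical stand-in for the fields the PR says are persisted."""
    filename: str
    sha256: str
    size_bytes: int
    chunk_count: int


def build_metadata(filename: str, content: bytes, chunk_count: int) -> DocumentMetadata:
    """Derive the hash and size from the raw uploaded bytes."""
    return DocumentMetadata(
        filename=filename,
        sha256=hashlib.sha256(content).hexdigest(),
        size_bytes=len(content),
        chunk_count=chunk_count,
    )


def persist(conn: sqlite3.Connection, meta: DocumentMetadata) -> int:
    """Insert one metadata row and return its primary key.

    The schema here is an assumption for illustration only; the real
    schema is defined by the PR's Alembic migration.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS ingested_documents (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            filename TEXT NOT NULL,
            sha256 TEXT NOT NULL,
            size_bytes INTEGER NOT NULL,
            chunk_count INTEGER NOT NULL
        )
        """
    )
    cur = conn.execute(
        "INSERT INTO ingested_documents (filename, sha256, size_bytes, chunk_count) "
        "VALUES (?, ?, ?, ?)",
        (meta.filename, meta.sha256, meta.size_bytes, meta.chunk_count),
    )
    conn.commit()
    return cur.lastrowid
```

Hashing the raw bytes rather than the filename means two uploads of the same file under different names produce the same digest, which a caller could use to detect duplicates; whether the PR does so is not stated.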
Type of Change
Checklist
Screenshots (if UI change)
N/A - Backend-only feature
CHANGED FILES
COMMITS
TESTING PERFORMED
FINAL STATUS