fix: serialize FAISS index rebuilds with lock and atomic swap #886

Open
Pcmhacker-piro wants to merge 1 commit into SdSarthak:main from Pcmhacker-piro:fix/faiss-concurrent-ingest-lock

@Pcmhacker-piro

Summary

Closes #761

Serializes concurrent FAISS index rebuilds so that overlapping /rag/ingest requests cannot overwrite each other or leave a partially-written index for readers. A process-level threading.Lock guards the rebuild path, the index is built in a temporary directory (never touching the live path), validated by reloading, then atomically swapped into settings.FAISS_INDEX_PATH via shutil.move.

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Refactor
  • Tests
  • Infra / CI

Checklist

  • I have read CONTRIBUTING.md
  • My code follows the project style (PEP 8 for Python, ESLint for TS)
  • I have added/updated tests where relevant
  • pytest backend/tests/ passes locally
  • I have not committed .env or any secrets
  • I have updated documentation if needed

Edge Cases Handled

| Edge case | How it's handled |
| --- | --- |
| Concurrent ingests | _rag_index_lock serializes rebuilds; the second request waits and never overwrites |
| Reader loads during ingest | shutil.move reduces to a rename when source and destination share a filesystem, so a reader gets the old or the new index, never a partial one |
| Corrupted staged index | FAISS.load_local() validates before the swap; if it raises, the temp dir is cleaned up and the live index is untouched |
| No existing index (first ingest) | No live path exists, so the os.path.exists branch is skipped and shutil.move creates the live path fresh |
| Temp dir cleanup | The TemporaryDirectory context manager removes the staged path automatically (no-op if already moved) |
| Cross-filesystem temp | tempfile.gettempdir() is assumed to be on the same filesystem as CWD, which holds on many but not all Linux/macOS setups (for example, not when /tmp is a separate tmpfs mount); otherwise shutil.move falls back to copy+delete, which is not atomic |
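Whether the cross-filesystem case applies on a given host can be checked directly. The helper below is a hypothetical sketch, not part of the PR: it compares the device IDs reported by `os.stat`, which is a reasonable heuristic for "a rename between these two directories will not need the copy+delete fallback".

```python
import os
import tempfile


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both existing paths report the same device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Compare the system temp dir with the current working directory.
    print(same_filesystem(tempfile.gettempdir(), os.getcwd()))
```

If this prints `False` for the temp dir and the directory holding the index, staging inside the index's parent directory (for example via the `dir=` argument of `tempfile.TemporaryDirectory`) is one way to keep the move a plain rename.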

Screenshots (if UI change)

Not applicable; backend reliability fix.

@Pcmhacker-piro
Author

Hi @SdSarthak,

The checks have passed. Could you please review and approve the pending workflows when you have a chance? Thank you!

Development

Successfully merging this pull request may close these issues.

Concurrent RAG ingests can overwrite the shared FAISS index