[FEAT] README contains inconsistent references to Pinecone and ChromaDB vector stores #555

@divyalankeshwar2007-svg

Description

Is your feature request related to a problem? Please describe.

The project has migrated its RAG backend to ChromaDB, and the active backend implementation uses ChromaDB for vector storage and retrieval. However, several parts of the documentation and legacy files still reference Pinecone.

For example:

  • README.md contains Pinecone setup references.
  • Some older Flask-based files still contain Pinecone-related code.
  • New contributors may become confused about which vector database is currently supported.

This inconsistency makes onboarding harder and can lead contributors to spend time configuring services that are no longer required by the active backend.

Describe the solution you'd like

I would like the documentation and repository structure to clearly reflect the currently supported vector database.

Possible improvements:

  • Review README.md for outdated Pinecone references.
  • Clearly document that the FastAPI backend uses ChromaDB.
  • Add notes indicating which files belong to the legacy Flask implementation.
  • Remove or mark obsolete Pinecone-related setup instructions where appropriate.
  • Improve contributor onboarding by providing a single source of truth for vector database configuration.
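
As a starting point for the review, the leftover references could be listed with a small script. This is only a minimal sketch: the function name `find_references`, the search pattern, and the set of file suffixes are assumptions made for illustration and are not part of the repository.

```python
import re
from pathlib import Path

# Assumed search term and file types; adjust to match the repository.
PATTERN = re.compile(r"pinecone", re.IGNORECASE)
TEXT_SUFFIXES = {".md", ".py", ".txt", ".toml", ".yml", ".yaml"}


def find_references(root, pattern=PATTERN, suffixes=TEXT_SUFFIXES):
    """Return (relative path, line number, stripped line) for each matching line."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        # Only look at regular files with a known text suffix.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        rel = path.relative_to(root)
        # Skip anything stored under the .git directory.
        if ".git" in rel.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # unreadable or non-UTF-8 file, ignore it
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((rel.as_posix(), lineno, line.strip()))
    return hits
```

Calling `find_references(".")` from the repository root returns one tuple per matching line, which can then be sorted into "update", "mark as legacy", or "remove" during the documentation pass.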

Describe alternatives you've considered

Contributors can inspect the codebase manually to determine which vector database is currently active, but this takes extra effort and can still leave newcomers unsure.

Additional Context

During repository exploration, the active FastAPI RAG implementation was found to use ChromaDB components such as:

  • backend/app/rag/vectorstore.py
  • backend/app/rag/retriever.py

At the same time, Pinecone references still appear in documentation and legacy files, which may create ambiguity about the project's current architecture.

This issue focuses on improving documentation consistency and contributor experience.

GSSoC '26

  • Yes, I am participating in GirlScript Summer of Code and would like to build this.

Labels

  • enhancement: New feature or improvement
  • gssoc: GirlScript Summer of Code 2026 issue/PR
