
feat: implement semantic sub-query decomposition and parallel retrieval with RRF merge #587

Open
knoxiboy wants to merge 1 commit into param20h:dev from knoxiboy:feat/issue-566-query-decomposition

Conversation

@knoxiboy


📋 PR Checklist


🔗 Related Issue

Closes #566


📝 What does this PR do?

Implements semantic sub-query decomposition for multi-part user questions:

  1. Updates transform_query in retriever.py to use a dedicated prompt that asks the LLM to identify the distinct semantic components of a complex query (e.g. a question that compares topics) and return them as a list of sub-queries.
  2. Executes dense vector and BM25 queries in parallel (using concurrent.futures.ThreadPoolExecutor) for each sub-query.
  3. Merges candidate lists from all sub-queries using a generalized Reciprocal Rank Fusion (RRF) algorithm.
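
As a rough illustration of steps 2 and 3, here is a minimal sketch, not the code in this PR. It assumes each retriever is a plain callable that takes a query string and returns a ranked list of document ids. The names `rrf_merge` and `retrieve_parallel`, the stub retrievers, and the constant `k = 60` (a commonly used RRF default) are all hypothetical choices made for this example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

# Assumption: a retriever maps a query string to a ranked list of doc ids.
Retriever = Callable[[str], List[str]]


def rrf_merge(ranked_lists: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse any number of ranked lists with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by doc id for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def retrieve_parallel(
    sub_queries: Sequence[str],
    retrievers: Sequence[Retriever],
    max_workers: int = 8,
) -> List[str]:
    """Run every (retriever, sub-query) pair in a thread pool, then fuse."""
    jobs = [(retriever, query) for query in sub_queries for retriever in retrievers]
    if not jobs:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(jobs))) as pool:
        futures = [pool.submit(retriever, query) for retriever, query in jobs]
        # Collect in submission order so the fused result is reproducible.
        ranked_lists = [future.result() for future in futures]
    return rrf_merge(ranked_lists)


# Stub retrievers standing in for the dense vector and BM25 backends.
def dense_stub(query: str) -> List[str]:
    return {"a": ["d1", "d2", "d3"], "b": ["d2", "d4"]}[query]


def bm25_stub(query: str) -> List[str]:
    return {"a": ["d2", "d1"], "b": ["d4", "d2"]}[query]


merged = retrieve_parallel(["a", "b"], [dense_stub, bm25_stub])
print(merged)
```

Because RRF only uses rank positions, the dense and BM25 scores never need to be normalized against each other, which is one reason it generalizes cleanly from two lists to one list per retriever per sub-query.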

🗂️ Type of Change

  • 🐛 Bug fix
  • ✨ New feature
  • 🔧 Refactor / code cleanup
  • 📝 Documentation update
  • 🎨 UI / styling change
  • ⚙️ CI / tooling / config change
  • 🧪 Tests

🧪 How was this tested?

  • Ran the backend locally
  • Verified retrieval pipeline manually

📸 Screenshots (if UI change)


⚠️ Anything to flag for reviewers?

None.


✅ Self-Review Checklist

  • My branch is based on dev, not main
  • I have not added any secrets / API keys
  • I have not modified main branch or any HuggingFace deployment config
  • My code follows the existing style (no unnecessary formatting changes)
  • I have updated relevant docs / comments if needed

@knoxiboy requested a review from param20h as a code owner on June 13, 2026, 12:23

Development

Successfully merging this pull request may close these issues.

[FEATURE] Implement Semantic Sub-query Decomposition for Multi-part User Questions

1 participant