3.Summary Tree & Retriever Integration by santo0 · Pull Request #105 · georgia-tech-db/TokenSmith

santo0 · 2026-04-10T02:05:30Z

PR structure

These PRs are stacked: each one depends on the ones before it (#97, then #98, then this PR).

SummaryEntry + summary_tree.py
Builds LLM-generated summaries bottom-up across the section tree. Leaf nodes summarize sliding windows of adjacent chunks; internal nodes summarize their children's summaries. All summaries are embedded and persisted as a FAISS index (summary_index.faiss + summary_meta.json) under the run directory.
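
A minimal sketch of the bottom-up build described above, not the code in summary_tree.py. The `Section` node type, the `SummaryEntry` fields, the window `size`/`stride` defaults, and the `summarize` callable (a stand-in for the LLM call) are all assumptions made for illustration. Embedding and FAISS persistence are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Section:
    """Hypothetical section-tree node; the PR's real node type may differ."""
    title: str
    chunk_ids: List[int] = field(default_factory=list)
    children: List["Section"] = field(default_factory=list)


@dataclass
class SummaryEntry:
    """Illustrative fields only, not the PR's actual schema."""
    section: str
    text: str
    chunk_ids: List[int]  # every chunk this summary covers


def sliding_windows(items: List[int], size: int, stride: int) -> List[List[int]]:
    """Split adjacent chunk ids into overlapping windows that cover the tail."""
    if len(items) <= size:
        return [list(items)] if items else []
    starts = list(range(0, len(items) - size + 1, stride))
    if starts[-1] != len(items) - size:
        starts.append(len(items) - size)
    return [items[s:s + size] for s in starts]


def build_summaries(
    node: Section,
    chunks: Dict[int, str],
    summarize: Callable[[List[str]], str],
    out: List[SummaryEntry],
    size: int = 3,
    stride: int = 2,
) -> List[SummaryEntry]:
    """Post-order walk. Returns the entries created at `node`; appends all to `out`."""
    if not node.children:
        # Leaf section: one summary per window of adjacent chunks.
        entries = [
            SummaryEntry(node.title, summarize([chunks[i] for i in w]), list(w))
            for w in sliding_windows(node.chunk_ids, size, stride)
        ]
    else:
        # Internal section: summarize the children's summaries, not raw chunks.
        child_entries = [
            e
            for child in node.children
            for e in build_summaries(child, chunks, summarize, out, size, stride)
        ]
        covered = sorted({i for e in child_entries for i in e.chunk_ids})
        text = summarize([e.text for e in child_entries])
        entries = [SummaryEntry(node.title, text, covered)]
    out.extend(entries)
    return entries
```

Because children are processed before their parent, `out` ends up ordered leaves-first, and each entry records the chunk ids it covers so a retriever can map a summary hit back to chunks.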

SectionSummaryRetriever (name="section_summary")
At query time, embeds the query and searches the summary FAISS index. Each matching summary distributes its cosine similarity score to every chunk it covers; a chunk's final score is the max across all hits. Lazy-loads the embedding model on first call.
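
The score propagation can be sketched as follows. This is an illustration under stated assumptions, not the PR's SectionSummaryRetriever: a brute-force cosine search stands in for the FAISS index, and `SummaryRetrieverSketch`, `load_embedder`, and `get_scores` are hypothetical names.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def score_chunks(
    query_vec: Vector,
    summaries: List[Tuple[Vector, List[int]]],  # (summary embedding, covered chunk ids)
    top_k: int = 5,
) -> Dict[int, float]:
    """Give each hit's similarity to every chunk it covers; keep the max per chunk."""
    hits = sorted(
        ((cosine(query_vec, emb), ids) for emb, ids in summaries),
        key=lambda h: h[0],
        reverse=True,
    )[:top_k]
    scores: Dict[int, float] = {}
    for sim, chunk_ids in hits:
        for cid in chunk_ids:
            scores[cid] = max(scores.get(cid, float("-inf")), sim)
    return scores


class SummaryRetrieverSketch:
    """Hypothetical wrapper showing lazy loading of the embedding model."""

    name = "section_summary"

    def __init__(self, summaries, load_embedder: Callable[[], Callable[[str], Vector]]):
        self._summaries = summaries
        self._load_embedder = load_embedder
        self._embed: Optional[Callable[[str], Vector]] = None

    def get_scores(self, query: str, top_k: int = 5) -> Dict[int, float]:
        if self._embed is None:  # the model is loaded on the first call only
            self._embed = self._load_embedder()
        return score_chunks(self._embed(query), self._summaries, top_k)
```

Taking the max rather than the sum means a chunk covered by many overlapping summaries is not favored just for being covered more often.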

Update benchmark_retrieval.py
Evaluates all three KG retrievers (kg_node, section_tree, section_summary) plus the existing FAISS/BM25 retrievers against a query set from tests/benchmarks.yaml. Supports optional LLM relevance grading via OpenRouter.
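
A rough sketch of what such an evaluation loop could look like. The PR text does not say which metrics benchmark_retrieval.py reports or how tests/benchmarks.yaml is laid out, so recall@k and the `question` / `relevant_chunks` keys below are assumptions chosen purely for illustration. LLM grading is omitted.

```python
from typing import Callable, Dict, List, Set

# A retriever is modeled here as: query text -> {chunk_id: score}.
Retriever = Callable[[str], Dict[int, float]]


def recall_at_k(scores: Dict[int, float], relevant: Set[int], k: int) -> float:
    """Fraction of relevant chunks that appear among the k highest-scoring chunks."""
    if not relevant:
        return 0.0
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return len(set(top) & relevant) / len(relevant)


def benchmark(
    retrievers: Dict[str, Retriever],
    queries: List[dict],  # assumed shape: {"question": str, "relevant_chunks": [int]}
    k: int = 5,
) -> Dict[str, float]:
    """Mean recall@k per retriever over the whole query set."""
    results: Dict[str, float] = {}
    for name, retrieve in retrievers.items():
        per_query = [
            recall_at_k(retrieve(q["question"]), set(q["relevant_chunks"]), k)
            for q in queries
        ]
        results[name] = sum(per_query) / len(per_query) if per_query else 0.0
    return results
```

Running every retriever over the same query set with the same metric is what makes the KG retrievers comparable with the FAISS/BM25 baselines.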

KG and Section retrievers integration in main.py
Each of kg_node, section_tree, and section_summary is only loaded if its weight in ranker_weights is non-zero. CanonicalLookup is built once and shared across KG retrievers.
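
The gating logic can be sketched like this. It is not the code in main.py: `load_kg_retrievers`, `build_lookup`, and `factories` are hypothetical names, and building the lookup on first need (rather than up front) is an assumption.

```python
from typing import Any, Callable, Dict

KG_RETRIEVERS = ("kg_node", "section_tree", "section_summary")


def load_kg_retrievers(
    ranker_weights: Dict[str, float],
    build_lookup: Callable[[], Any],            # stands in for constructing CanonicalLookup
    factories: Dict[str, Callable[[Any], Any]],  # retriever name -> constructor taking the lookup
) -> Dict[str, Any]:
    """Instantiate only retrievers with a non-zero weight, sharing one lookup."""
    lookup = None
    loaded: Dict[str, Any] = {}
    for name in KG_RETRIEVERS:
        if ranker_weights.get(name, 0) == 0:
            continue  # zero or missing weight: skip loading entirely
        if lookup is None:
            lookup = build_lookup()  # built at most once
        loaded[name] = factories[name](lookup)
    return loaded
```

With weights such as `{"faiss": 0.5, "kg_node": 0.3, "section_tree": 0.0, "section_summary": 0.2}`, only `kg_node` and `section_summary` are constructed, and both receive the same lookup object.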

santo0 · 2026-04-14T19:18:19Z

I'm trying to simplify this PR; I will post here when I'm done.

santo0 added 8 commits April 2, 2026 10:38

feat: Implement MockCanonicalizer for caching canonicalization results

c4f59ac

fix: Correct JSON formatting in SYNONYM_PROMPT response structure

27991a4

feat: Introduce KGNodeRetriever, SectionTreeRetriever and benchmark retrieval script for enhanced query evaluation

4822b78

feat: Enhance knowledge graph retrieval with summary indexing and new retrievers

c4db4b8

Merge branch 'kg-enhancement' into kg-summary-tree

3a80e5c

Merge branch 'kg-enhancement' into kg-summary-tree

1960a97

feat: Add summary tree configuration to kg_pipeline in config.yaml

5e2a349

refactor: Remove unused command-line arguments for summarization in run_kg_pipeline.py

c53d2d6

santo0 changed the title ~~Kg summary tree~~ Kg Summary Tree Apr 10, 2026

santo0 changed the title ~~Kg Summary Tree~~ Summary Tree & Retriever Integration Apr 10, 2026

santo0 marked this pull request as ready for review April 10, 2026 13:59

santo0 added 2 commits April 10, 2026 10:11

Merge branch 'kg-enhancement' into kg-summary-tree

baa881f

Merge branch 'kg-enhancement' into kg-summary-tree

deb6e03

santo0 changed the title ~~Summary Tree & Retriever Integration~~ 3.Summary Tree & Retriever Integration Apr 15, 2026

This was referenced Apr 15, 2026

1.Knowledge Graph Building Pipeline #97

Open

2.Canonicalization, Section Tree and KG Retriever #98

Open

santo0 added 5 commits April 15, 2026 16:50

Merge branch 'kg-enhancement' into kg-summary-tree

7a81167

feat: add SummaryTreeConfig and integrate into KGPipelineConfig; refactor summary index building

22baea0

Merge branch 'kg-enhancement' into kg-summary-tree

74e6dc0

Merge branch 'kg-enhancement' into kg-summary-tree

26a4456

Merge branch 'kg-enhancement' into kg-summary-tree

8238edc

santo0 wants to merge 15 commits into kg-enhancement from kg-summary-tree
