🗺️ BRIEF-PRO: Universal Context Compression with Short-to-Long Synthesis for Fast and Accurate Multi-Hop Reasoning

Jia-Chen Gu*, Junyi Zhang*, Di Wu, Yuankai Li, Kai-Wei Chang and Nanyun Peng

🔗 Paper: https://arxiv.org/abs/2510.13799

🌐 Website: https://jasonforjoy.github.io/BRIEF/

🤗 Dataset: https://huggingface.co/datasets/uclanlp/Brief-Pro

🤗 Model: https://huggingface.co/uclanlp/brief-pro

News

🔥 We released the training and evaluation code, the model checkpoint, and the training data.

🚀 Overview

🤖 BRIEF-PRO is a universal, lightweight compressor that distills relevant evidence for a given query from multiple retrieved documents into a concise summary for seamless integration into in-context RAG.
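The compress-then-read flow can be sketched in a few lines. Here `toy_compress` is a hypothetical stand-in for the BRIEF-PRO model (uclanlp/brief-pro), which generates the summary with an LLM rather than by keyword matching; the prompt template is likewise illustrative, not the repo's exact format.

```python
import re

def toy_compress(query: str, documents: list[str]) -> str:
    """Toy heuristic: keep sentences that share a content word (>3 chars)
    with the query. BRIEF-PRO does this abstractively with a trained LLM."""
    content = lambda text: {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}
    query_words = content(query)
    kept = []
    for doc in documents:
        for sentence in re.split(r"(?<=\.)\s+", doc):
            if query_words & content(sentence):
                kept.append(sentence)
    return " ".join(kept)

def build_reader_prompt(query: str, context: str) -> str:
    """The compressed summary replaces the raw documents in the reader prompt."""
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

docs = [
    "Paris is the capital of France. The Seine flows through Paris.",
    "The Eiffel Tower was completed in 1889. It is located in Paris.",
]
query = "When was the Eiffel Tower completed?"
summary = toy_compress(query, docs)
prompt = build_reader_prompt(query, summary)
```

The reader model then answers from the short `prompt` instead of the full document set, which is where the inference speedup comes from.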


✨ Key Features

  • 🔧 BRIEF-PRO pioneers the exploration of multi-hop reasoning and compression for RAG over long contexts of 10k+ words across diverse scenarios.
  • ⚙️ A synthetic data pipeline, built on short-context seed data, synthesizes long-context training data for compression learning.
  • 🧩 Trained on the curated dataset, BRIEF-PRO generates concise summaries that accelerate inference and improve the accuracy of a wide range of small, large, and proprietary language models.
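The short-to-long synthesis idea can be sketched roughly as follows. The field names, document counts, and distractor-sampling strategy here are illustrative assumptions, not the repo's exact pipeline.

```python
import random

def synthesize_long_example(seed: dict, distractor_pool: list[str],
                            target_docs: int = 20, rng=None) -> dict:
    """Pad a short-context seed (gold docs + query + gold summary) with
    sampled distractor documents, then shuffle, so the compressor must
    locate the scattered gold evidence inside a much longer context."""
    rng = rng or random.Random(0)
    n_distractors = max(0, target_docs - len(seed["gold_docs"]))
    docs = seed["gold_docs"] + rng.sample(distractor_pool, n_distractors)
    rng.shuffle(docs)
    return {
        "query": seed["query"],
        "context": docs,              # long, mostly-irrelevant input context
        "summary": seed["summary"],   # concise gold evidence (compression target)
    }
```

The compressor is then trained to map the long `context` back to the short `summary`, conditioned on the `query`.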


🏃🏻‍♀️ Training

📍 Installation:

```shell
conda create -n axolotl python=3.10 -y
conda activate axolotl

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

pip install packaging ninja
pip install flash-attn --no-build-isolation

cd Axolotl/
pip install --index-url https://download.pytorch.org/whl/cu124 torch==2.6.0+cu124
pip install xformers==0.0.29.post2
pip install axolotl==0.5.0 accelerate peft optimum bitsandbytes liger-kernel lm-eval

pip install -e .
pip install -e '.[deepspeed]'
```

Alternatively, follow the instructions on the Axolotl website (we use axolotl==0.5.0) to set up the training environment.

💡 Running:

Run the following command to start training:

```shell
bash ./BRIEF-Pro/train/Axolotl/examples/llama-3.2/train.sh
```

🔬 Evaluation

📍 Installation:

Follow the vLLM installation instructions to set up the multidoc_vllm environment.

Follow the LongBench installation instructions to set up the longbench environment.

You can also quickly set up the environments using the provided .yml files.

```shell
conda env create -f ./BRIEF-Pro/env/multidoc_vllm_env.yml
conda env create -f ./BRIEF-Pro/env/longbench_env.yml
```

📖 Data Preparation:

  • Download the musique.jsonl, hotpotqa.jsonl, and 2wikimqa.jsonl files from LongBench and place them in ./BRIEF-Pro/eval/LongBench/data/.

  • Run ./BRIEF-Pro/eval/LongBench/convert.py to align the test data titles with our training format.

  • We provide the SealQA test data file at ./BRIEF-Pro/eval/LongBench/unbiasdata/longseal.jsonl.
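The title-alignment step can be sketched like this. The exact target schema is defined by the repo's `convert.py`; the `input`/`context`/`answers` field names follow LongBench's released files, and the `Passage i:` layout of the context is an assumption about those files.

```python
import json
import re

def split_passages(context: str) -> list[dict]:
    """Split a multi-doc context of the form
    'Passage 1:\nTitle\ntext...\nPassage 2:\n...' into title/text records."""
    chunks = re.split(r"Passage \d+:\n", context)
    docs = []
    for chunk in chunks:
        if not chunk.strip():
            continue
        title, _, text = chunk.partition("\n")
        docs.append({"title": title.strip(), "text": text.strip()})
    return docs

def convert_file(src_path: str, dst_path: str) -> None:
    """Rewrite each LongBench jsonl record into a query + documents layout."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            ex = json.loads(line)
            record = {
                "query": ex["input"],
                "documents": split_passages(ex["context"]),
                "answers": ex["answers"],
            }
            dst.write(json.dumps(record) + "\n")
```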

💡 Running:

Run the following commands for evaluation:

  • BRIEF-PRO as the Compressor

    • Llama-3.1-8B-Instruct / Llama-3.1-70B-Instruct as the Reader Model:

      ```shell
      bash ./BRIEF-Pro/eval/test_pipe_testAll_UserControl.sh
      bash ./BRIEF-Pro/eval/test_pipe_testSealQA_UserControl.sh
      ```

    • GPT-4.1-nano as the Reader Model:

      ```shell
      python ./BRIEF-Pro/eval/GPT_pred.py
      ```

Citation

```bibtex
@misc{gu2025briefprouniversalcontextcompression,
      title={BRIEF-Pro: Universal Context Compression with Short-to-Long Synthesis for Fast and Accurate Multi-Hop Reasoning},
      author={Jia-Chen Gu and Junyi Zhang and Di Wu and Yuankai Li and Kai-Wei Chang and Nanyun Peng},
      year={2025},
      eprint={2510.13799},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.13799},
}
```

About

Bridging Retrieval and Inference through Evidence Fusion
