# Reasoning in Flux: Enhancing Large Language Models Reasoning through Uncertainty-aware Adaptive Guidance
This repository contains the code related to the paper "Reasoning in Flux: Enhancing Large Language Models Reasoning through Uncertainty-aware Adaptive Guidance". In this paper, we delve into the underlying factors contributing to reasoning errors in Large Language Models (LLMs) and introduce Uncertainty-aware Adaptive Guidance (UAG), a novel approach for guiding LLM reasoning onto an accurate and reliable trajectory.
## Requirements

Please make sure you have the following requirements installed:
- transformers >= 4.46.2
- torch
- numpy
- sklearn
- tqdm
## Data

Our dataset originates from *Large Language Models are Zero-Shot Reasoners*, generously shared by Takeshi Kojima. We select prompts from *Chain-of-Thought Prompting Elicits Reasoning in Large Language Models* to guide models in generating initial reasoning processes.
## Outputs

We share some outputs of UAG through Google Drive.
## Quick Start

We provide a simple script in `main.py` to launch the GSM8K task using UAG:

```python
import subprocess

subprocess.call(
    "python uag.py \
    --task GSM8K \
    --data-path GSM8K_input.jsonl \
    --record-path GSM8K_output.jsonl \
    --demonstration-path GSM8K_demonstration.jsonl",
    shell=True,
)
```
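Both the input and record files use the JSON Lines format, so the generated records can be inspected with a few lines of Python. The field names are defined by `uag.py`, so this sketch only prints the keys:

```python
import json

# Each line of the record file is one JSON object; the exact keys
# depend on the task and on uag.py.
with open("GSM8K_output.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        print(record.keys())
```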
## Parameters

To help you better understand and utilize the script, here is a detailed explanation of the parameters, grouped by their functionality:

**Data parameters**

- `--task`: (str, default=`"GSM8K"`)
  **Description**: The name of the task to perform. Examples include `"GSM8K"`, `"AQuA"`, `"CSQA"`, etc.
- `--data-path`: (str, default=`"UAG_input.jsonl"`)
  **Description**: The path to the input data file in JSON Lines format.
- `--record-path`: (str, default=`"UAG_output.jsonl"`)
  **Description**: The path where the output records will be saved.
- `--demonstration-path`: (str, default=`None`)
  **Description**: The path to the demonstrations file. This can be used to specify task-specific prompts.
- `--demonstration-number`: (int, default=16)
  **Description**: The number of demonstrations to use. This helps in constructing diverse and high-quality demonstrations.
**UAG parameters**

- `--theta`: (float, default=16)
  **Description**: The uncertainty threshold. This determines when UAG should intervene in the reasoning process (see the sketch after this list).
- `--lambda1`: (float, default=0.5)
  **Description**: Weight for the relevance score in the uncertainty computation.
- `--lambda2`: (float, default=0.5)
  **Description**: Weight for the originality score in the uncertainty computation.
- `--k`: (int, default=8)
  **Description**: The number of clusters for demonstration clustering.
**Generation parameters**

- `--temperature`: (float, default=0.5)
  **Description**: The temperature for text generation. Higher values result in more diverse outputs.
- `--max-length`: (int, default=2048)
  **Description**: The maximum sequence length for generated outputs.
- `--max-loop`: (int, default=10)
  **Description**: The maximum number of loops (iterations) allowed in the reasoning process.
- `--device`: (str, default=`"cuda"` if available, else `"cpu"`)
  **Description**: The device to use for computation. Can be `"cuda"` or `"cpu"`.
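To illustrate how `theta`, `lambda1`, and `lambda2` interact, here is a minimal sketch of a weighted uncertainty score and the corresponding threshold check. The function bodies are our own simplification for illustration; the actual `compute_uncertainties` logic lives in `uag.py` and may differ:

```python
# Simplified sketch: the real compute_uncertainties in uag.py may use
# different inputs and a different combination rule.
def compute_uncertainty(relevance: float, originality: float,
                        lambda1: float = 0.5, lambda2: float = 0.5) -> float:
    """Combine the relevance and originality scores with weights lambda1/lambda2."""
    return lambda1 * relevance + lambda2 * originality


def should_intervene(uncertainty: float, theta: float = 16.0) -> bool:
    """UAG intervenes in the reasoning process once uncertainty exceeds theta."""
    return uncertainty > theta
```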
## Tips

We use the open-source embedding model `nvidia/NV-Embed-v2`, which eliminates the cost overhead associated with paid embedding models. We have found that this embedding model has good consistency with the Mistral model.
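For reference, here is a minimal sketch of embedding texts with this model via `transformers`, following the `trust_remote_code` usage shown on the model card (the `encode` helper comes from the model's remote code, and the exact arguments may vary between versions):

```python
from transformers import AutoModel

# NV-Embed-v2 exposes an encode() helper through its remote code.
model = AutoModel.from_pretrained("nvidia/NV-Embed-v2", trust_remote_code=True)

texts = [
    "Natalia sold clips to 48 of her friends in April.",
    "Weng earns $12 an hour for babysitting.",
]
embeddings = model.encode(texts, instruction="", max_length=4096)
print(embeddings.shape)  # (num_texts, embedding_dim)
```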
In `prompt.py`, we provide some prompt examples. We recommend constructing diverse and high-quality demonstrations for different tasks. You can specify `--demonstration-path` to further expand task-specific prompts.
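The exact schema of the demonstrations file is defined by `uag.py`; as a purely hypothetical illustration (the field names here are our own), a JSON Lines demonstrations file could be assembled like this:

```python
import json

# Hypothetical demonstration entry; check uag.py for the fields it expects.
demo = {
    "question": "Natalia sold clips to 48 of her friends in April, and then "
                "she sold half as many clips in May. How many clips did "
                "Natalia sell altogether in April and May?",
    "rationale": "Natalia sold 48 / 2 = 24 clips in May, "
                 "so she sold 48 + 24 = 72 clips altogether.",
    "answer": "72",
}
with open("GSM8K_demonstration.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(demo) + "\n")
```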
You can adapt UAG to new reasoning tasks and scenarios by modifying the `adaptive_reasoning_adjustment` function in `uag.py`. Additionally, you can adjust hyperparameters like `theta`, `lambda1`, `lambda2`, and the generation `temperature` to achieve better performance. You can also customize your uncertainty calculation method; we have provided the `compute_uncertainties` function to facilitate the construction of task-specific uncertainty computation methods.
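As an example of this kind of customization, here is a sketch of grouping demonstration embeddings into `k` clusters with scikit-learn and picking one representative per cluster. This is our own illustration of demonstration clustering, not the repository's implementation:

```python
import numpy as np
from sklearn.cluster import KMeans

def select_representatives(embeddings: np.ndarray, k: int = 8) -> list[int]:
    """Cluster demonstration embeddings and return one index per cluster."""
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(embeddings)
    representatives = []
    for c in range(k):
        members = np.where(kmeans.labels_ == c)[0]
        # Pick the member closest to the cluster centroid.
        dists = np.linalg.norm(
            embeddings[members] - kmeans.cluster_centers_[c], axis=1
        )
        representatives.append(int(members[np.argmin(dists)]))
    return representatives
```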
## Contact

If you have any suggestions or questions, feel free to email us at yinzhangyue@126.com. If you encounter any issues while using the code, or if you find any bugs, please open a new issue on GitHub. We are very open to any constructive feedback that could help us improve. Thank you for your attention!
## Citation

If you are interested in our work, please use the following citation format when referencing our paper:
```bibtex
@inproceedings{yin-etal-2024-reasoning,
    title = "Reasoning in Flux: Enhancing Large Language Models Reasoning through Uncertainty-aware Adaptive Guidance",
    author = "Yin, Zhangyue and
      Sun, Qiushi and
      Guo, Qipeng and
      Zeng, Zhiyuan and
      Li, Xiaonan and
      Dai, Junqi and
      Cheng, Qinyuan and
      Huang, Xuanjing and
      Qiu, Xipeng",
    editor = "Ku, Lun-Wei and
      Martins, Andre and
      Srikumar, Vivek",
    booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.acl-long.131",
    doi = "10.18653/v1/2024.acl-long.131",
    pages = "2401--2416"
}
```