revrec

Data and code for the following paper.

Vanja M Karan, Stephen McQuistin, Ryo Yanagida, Colin Perkins, Gareth Tyson, Ignacio Castro, Patrick Healey, Matthew Purver, A Dataset for Expert Reviewer Recommendation with Large Language Models as Zero-shot Rankers, Proceedings of The 31st International Conference on Computational Linguistics, 2025

Quickstart

To reproduce the results from the paper, run:

python eval.py results-sts-[variant]-ta-[dataset].pickle

Where:

[variant] selects the model variant to evaluate and consists of two components, joined by a hyphen when both are present:

  • model family: "3" for llama3, or an empty string for llama2
  • model size: "7b", "8b", or "70b"

  E.g., 8b llama2 corresponds to "8b" and 70b llama3 corresponds to "3-70b".

[dataset] selects the dataset to evaluate on and can be "ietf", "nips", or "stelmakh".
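The naming scheme above can be sketched as a small helper that assembles the results file name to pass to eval.py. This is only an illustration of the convention described here: the function name `results_filename` and its parameters are hypothetical and are not part of the repository.

```python
# Hypothetical helper (not part of the repository): builds the results file
# name from the components described in the Quickstart section.
DATASETS = ("ietf", "nips", "stelmakh")
SIZES = ("7b", "8b", "70b")


def results_filename(llama_version: int, size: str, dataset: str) -> str:
    """Return e.g. 'results-sts-3-70b-ta-ietf.pickle'."""
    if llama_version not in (2, 3):
        raise ValueError("llama_version must be 2 or 3")
    if size not in SIZES:
        raise ValueError(f"size must be one of {SIZES}")
    if dataset not in DATASETS:
        raise ValueError(f"dataset must be one of {DATASETS}")
    # llama3 contributes the prefix "3-"; llama2 contributes an empty string.
    prefix = "3-" if llama_version == 3 else ""
    return f"results-sts-{prefix}{size}-ta-{dataset}.pickle"


if __name__ == "__main__":
    print(results_filename(3, "70b", "ietf"))
    print(results_filename(2, "8b", "nips"))
```

The printed name can then be passed as the single argument to eval.py, as shown in the command above.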

Data

Prompts used in the paper are available in prompts-sts-ta-[dataset].pickle files. Each file contains a list of tuples of the form (label_data, prompt_text). Each tuple represents a reviewer and a pair of papers that the LLM must rate, together with the true label. More details below:

  • label_data is a tuple of the form (reviewer_id, paperA_id, paperB_id, correct_label, gold_score_distance)
    • ids refer to the original data files, correct_label is "first" or "second", and gold_score_distance is the difference between the scores used for calculating the evaluation metric
  • prompt_text is a string generated using the following template:
  prompt = """[INST] <<SYS>> You are an expert pairing reviewers with suitable papers to review. <</SYS>> 
           The description of the reviewer is as follows:\n {reviewer_description} \n\n\n 
           Description of paper A:\n {paperA_description} \n\n\n 
           Description of paper B: \n {paperB_description} \n\n\n 
           Which paper is more relevant for this reviewer (your answer must be either 'paper A' or 'paper B')?
           [/INST] My answer is:"""
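As a rough illustration of how the three placeholders in the template are filled, the sketch below applies `str.format` to a copy of it. The descriptions are made-up placeholder strings, the function name `build_prompt` is hypothetical, and the exact whitespace of the template may differ from the one used to generate the released prompts.

```python
# Copy of the template shown above; whitespace may differ from the original.
PROMPT_TEMPLATE = """[INST] <<SYS>> You are an expert pairing reviewers with suitable papers to review. <</SYS>>
The description of the reviewer is as follows:\n {reviewer_description} \n\n\n
Description of paper A:\n {paperA_description} \n\n\n
Description of paper B: \n {paperB_description} \n\n\n
Which paper is more relevant for this reviewer (your answer must be either 'paper A' or 'paper B')?
[/INST] My answer is:"""


def build_prompt(reviewer: str, paper_a: str, paper_b: str) -> str:
    """Fill the three placeholders of the template (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(
        reviewer_description=reviewer,
        paperA_description=paper_a,
        paperB_description=paper_b,
    )


if __name__ == "__main__":
    # Made-up descriptions, for illustration only.
    print(build_prompt(
        "Works on routing protocols.",
        "A draft about routing extensions.",
        "A draft about email header fields.",
    ))
```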

(For datasets where there is a single paper and a pair of reviewers, the prompt and label data are analogous to the above, with the roles of papers and reviewers switched.)
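A minimal sketch of reading a prompts file and unpacking its entries, assuming the layout described above. To stay self-contained it writes a tiny stand-in file with made-up ids and prompt text instead of reading a real prompts-sts-ta-[dataset].pickle file; the function names are hypothetical. Note that unpickling can execute arbitrary code, so only load pickle files from sources you trust.

```python
import pickle
import tempfile
from pathlib import Path


def load_prompts(path):
    """Load a list of (label_data, prompt_text) tuples from a pickle file."""
    with open(path, "rb") as f:
        return pickle.load(f)


def count_labels(prompts):
    """Count how often each correct_label ("first" or "second") occurs."""
    counts = {"first": 0, "second": 0}
    for label_data, prompt_text in prompts:
        # Field order follows the label_data description above.
        (reviewer_id, paper_a_id, paper_b_id,
         correct_label, gold_score_distance) = label_data
        counts[correct_label] += 1
    return counts


def demo():
    # Made-up stand-in data with the same tuple layout as the real files.
    sample = [
        (("rev1", "p10", "p11", "first", 2.0), "[INST] prompt one [/INST]"),
        (("rev2", "p12", "p13", "second", 1.0), "[INST] prompt two [/INST]"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "prompts-sts-ta-example.pickle"
        with open(path, "wb") as f:
            pickle.dump(sample, f)
        return count_labels(load_prompts(path))


if __name__ == "__main__":
    print(demo())
```

For the real data, `load_prompts` would be pointed at one of the released prompts files instead of the temporary stand-in.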

About

Data and code for a COLING 2025 paper
