searchsim-org/persona-ir

persona-ir — Persona-Driven Variance in IR Retrieval

Reference implementation and release artefacts for the paper "Persona-Driven Variance in IR Retrieval".

The repository measures how much a retriever's output depends on who is asking when the information need is held fixed. It compares BM25 and four dense encoders (MiniLM-L6, MPNet-base, E5-base-v2, BGE-base-en-v1.5) on a 5,000-passage QReCC subset under two persona-generation methodologies: a controlled activation-steering condition and an uncontrolled prompt condition.

Headline: under the controlled steered condition, BM25 returns a different top-1 document in 65% of (persona, need) pairs while MPNet does so in 26% — a 2.5× sparse-vs-dense gap with paired-permutation p = 0.008. The gap survives a random word-swap control, three BM25 configurations, a Qwen 2.5 7B simulator substitution, an 80-user PRISM real-user replication (1.30×), and per-axis decomposition.
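
To make the headline metric concrete, here is a minimal sketch of a top-1 change rate and a paired sign-flip permutation test. It makes several assumptions that this README does not confirm: that "different top-1" is judged against a reference query for the same need, that the test pairs the two retrievers on each (persona, need) pair, and that the test is two-sided. The function names and the toy indicators are hypothetical and are not the paper's data or code.

```python
import random


def top1_change_rate(changed):
    """Fraction of (persona, need) pairs whose top-1 document changed."""
    return sum(changed) / len(changed)


def paired_permutation_p(a, b, n_perm=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired 0/1 indicators."""
    if len(a) != len(b):
        raise ValueError("paired inputs must have equal length")
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(n_perm):
        # Under the null, retriever labels are exchangeable within a pair,
        # so each paired difference keeps or flips its sign with prob. 1/2.
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(permuted) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy indicators (NOT the paper's data): 1 = top-1 differs from the reference.
bm25_changed = [1, 1, 0, 1, 1, 0, 1, 1]
dense_changed = [0, 1, 0, 0, 1, 0, 0, 0]

bm25_rate = top1_change_rate(bm25_changed)
dense_rate = top1_change_rate(dense_changed)
p_value = paired_permutation_p(bm25_changed, dense_changed)
print(f"BM25 {bm25_rate:.2f}, dense {dense_rate:.2f}, "
      f"ratio {bm25_rate / dense_rate:.2f}, p = {p_value:.4f}")
```

The sign-flip form is used here because the indicators are paired by (persona, need); an unpaired shuffle would discard that pairing.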

Layout

persona-ir/
├── scripts/             # query generators, retrievers, evaluation, viewer build
├── results/             # JSON outputs of every experiment (phase 1–2 + E1–E6)
├── viewer/              # static HTML viewer of per-persona top-3 docs
├── figures/             # hero figure source PDFs/PNGs
├── requirements.txt
├── LICENSE              # MIT
└── README.md

What is released

  • 160 triples: 80 steered + 80 prompt, each set covering 10 personas × 8 information needs.
    • Prompt triples (with full query text): results/phase2a_prompt_baseline.json.
    • Steered triples: regenerated by scripts/phase2a_prompt_baseline.py with --mode steered, or extracted from the paired-session JSONs (see Reproducing the steered condition).
  • Both query generators: prompt conditioning via scripts/phase2a_prompt_baseline.py, residual-stream activation steering via the recipe in the steered runner.
  • Evaluation code for all six experiments (E1–E6) and the two NDCG / coherence passes.
  • Static viewer: viewer/index.html displays each persona's query and the top-3 retrieved documents per retriever; no model runs in the browser.

Quick start

python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Reproduce phase-2a prompt baseline (80 triples, 5 retrievers, 5k QReCC)
python3 scripts/phase2a_prompt_baseline.py \
    --qrecc data/raw/qrecc/qrecc_train.json \
    --n-passages 5000

# Run all robustness experiments
bash scripts/run_all_e1_e5.sh

Each runner writes its output to results/<experiment>.json and prints a one-line summary.
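
Since the schema of each results file is not described here, a schema-agnostic way to take stock of the outputs is to list every JSON file with its top-level keys. The helper below is a hypothetical sketch (the name summarize_results is not part of the repository) and assumes only that results/ contains valid JSON files.

```python
import json
from pathlib import Path


def summarize_results(results_dir="results"):
    """Map each JSON file name to its sorted top-level keys (or its type)."""
    root = Path(results_dir)
    if not root.is_dir():
        return {}
    summary = {}
    for path in sorted(root.glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        if isinstance(data, dict):
            summary[path.name] = sorted(data)
        else:
            # Non-object payloads (lists, numbers) are reported by type name.
            summary[path.name] = type(data).__name__
    return summary


if __name__ == "__main__":
    for name, keys in summarize_results().items():
        print(name, keys)
```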

Reproducing the steered condition

Steered queries are generated by the script using Gemma 2 2B-it with a layer-12 residual-stream mean-difference vector. The vector is fit on a held-out corpus of utterances at the extremes of each persona axis (jargon, impatience, query specificity, evidence-seeking, clarification tolerance) and applied with magnitude α = 1.5. The 80 steered queries used in the paper are also stored as persona_pos_text fields in the paired-session JSONs released alongside this repository.
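
The arithmetic behind a mean-difference steering vector can be sketched in a few lines. This is a toy illustration on plain Python lists, not the repository's runner: the function names are hypothetical, the 3-dimensional activations are made up, and the vector is applied unnormalized as h + α·v, which is an assumption about how "magnitude α" is defined. A real run would apply the same update to the layer-12 residual stream inside the model's forward pass.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_vector(pos_acts, neg_acts):
    """Mean-difference direction: mean(positive) - mean(negative)."""
    mu_pos = mean_vector(pos_acts)
    mu_neg = mean_vector(neg_acts)
    return [p - q for p, q in zip(mu_pos, mu_neg)]


def steer(hidden, vector, alpha=1.5):
    """Shift one hidden state along the steering vector (assumed unnormalized)."""
    return [h + alpha * v for h, v in zip(hidden, vector)]


# Toy activations for one axis (hypothetical values, 3 dimensions).
pos = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]  # utterances at the high extreme
neg = [[0.0, 0.0, 2.0], [0.0, 2.0, 2.0]]  # utterances at the low extreme

v = steering_vector(pos, neg)
steered = steer([0.0, 0.0, 0.0], v, alpha=1.5)
print("vector:", v)
print("steered hidden state:", steered)
```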

Static viewer

viewer/index.html is a single static page that loads viewer/data.js and lets you click through each (persona, need) pair to see the query and the top-3 retrieved documents per retriever. No model runs in the browser; it's a tool for inspection.

Citation

@inproceedings{persona-ir,
  author    = {Anonymous},
  title     = {Persona-Driven Variance in {IR} Retrieval},
  booktitle = {Anonymous Submission},
  year      = {2026}
}

License

MIT — see LICENSE.
