AMPO

AMPO teaser

Adaptive Mix Preference Optimization for Generative Recommendation

Accepted by SIGIR 2026 (Full Paper)


Overview

AMPO introduces an adaptive margin mechanism for pairwise preference optimization. Rather than applying a uniform margin to every preference pair, it calibrates optimization strength according to model confidence, making training more stable under heterogeneous recommendation signals and varying pair difficulty.
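To make the idea of a confidence-dependent margin concrete, here is a minimal sketch in plain Python. It is not the formulation from the paper or from src/ampo/trainer.py: the function names, the `base_margin` parameter, and the specific rule (shrinking the margin as the sigmoid of the score gap grows) are all assumptions chosen for illustration.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def adaptive_margin(score_gap: float, base_margin: float = 1.0) -> float:
    """Hypothetical rule (an assumption, not the paper's): shrink the margin
    as confidence grows.

    Confidence is taken to be sigmoid(score_gap), i.e. the model's probability
    of ranking the chosen item above the rejected one.
    """
    confidence = math.exp(log_sigmoid(score_gap))
    return base_margin * (1.0 - confidence)


def pairwise_loss(chosen_score: float, rejected_score: float, margin: float) -> float:
    """Margin-based logistic loss: -log sigmoid(gap - margin)."""
    return -log_sigmoid(chosen_score - rejected_score - margin)


# Compare a uniform margin with the adaptive one on an easy and a hard pair.
for name, chosen, rejected in [("easy pair", 3.0, -1.0), ("hard pair", 0.2, 0.1)]:
    gap = chosen - rejected
    uniform = pairwise_loss(chosen, rejected, margin=1.0)
    adaptive = pairwise_loss(chosen, rejected, margin=adaptive_margin(gap))
    print(f"{name}: uniform={uniform:.4f} adaptive={adaptive:.4f}")
```

Under this assumed rule, pairs the model already ranks confidently receive a margin near zero, while uncertain pairs keep a margin close to `base_margin`. In a gradient-based trainer the margin would usually be treated as a constant during backpropagation, but that detail is also an assumption here.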

The repository is organized for direct experimentation: it provides a lightweight training entry point and a compact implementation that can be extended with new optimization objectives for practical recommendation settings.


Setup

git clone https://github.com/jumbo-q/ampo.git
cd ampo
pip install -r requirements.txt

Quick Start

  1. Modify Configuration: Edit configs/default.yaml and replace the <your_model_path> and <your_train_data_path> placeholders with your own paths.

    model_args:
      model_name_or_path: "<your_model_path>"
    data:
      train_files:
        - "<your_train_data_path>"
  2. Launch Training: Run the provided shell script. It supports both single-GPU and multi-GPU training via DeepSpeed.

    # Multi-GPU training (default: 8 GPUs)
    bash scripts/train.sh
    
    # Single-GPU training
    NUM_GPUS=1 bash scripts/train.sh

Note: scripts/train.sh uses DeepSpeed for distributed training by default. You can override the GPU count and the config file path with the NUM_GPUS and CONFIG_FILE environment variables.
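Before launching, it can help to confirm that no angle-bracket placeholders remain in the config. The sketch below is an optional pre-flight check and not part of the repository; the helper name `find_placeholders` and the sample path `/data/train.jsonl` are hypothetical.

```python
from __future__ import annotations

import re

# Matches angle-bracket placeholders such as <your_model_path>.
PLACEHOLDER = re.compile(r"<[A-Za-z_][A-Za-z0-9_]*>")


def find_placeholders(config_text: str) -> list[tuple[int, str]]:
    """Return (line_number, placeholder) pairs for every unfilled placeholder."""
    found = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((number, match.group(0)))
    return found


# A partially edited config: the data path is filled in (hypothetical value),
# the model path is not.
sample = '''model_args:
  model_name_or_path: "<your_model_path>"
data:
  train_files:
    - "/data/train.jsonl"
'''
print(find_placeholders(sample))  # [(2, '<your_model_path>')]
```

To check the real file, pass the contents of configs/default.yaml to `find_placeholders` and launch training only if the returned list is empty.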

Data Format

AMPO expects pairwise preference data with the following logical fields:

Column    Description
prompt    input context or user history
chosen    preferred response / item
rejected  non-preferred response / item

Example:

{
  "prompt": "User history ...",
  "chosen": "Preferred item",
  "rejected": "Rejected item"
}
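A record in this shape can be checked before training with a small validator. This is a sketch, not code from the repository: the helper name `validate_record` is hypothetical, and it assumes all three fields are non-empty strings, which the field descriptions above suggest but do not state.

```python
from __future__ import annotations

import json

REQUIRED_FIELDS = ("prompt", "chosen", "rejected")


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one preference record (empty if valid)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Assumption: each field must be a non-empty string.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"'{field}' must be a non-empty string")
    # A pair whose two sides are identical carries no preference signal.
    if not problems and record["chosen"] == record["rejected"]:
        problems.append("'chosen' and 'rejected' are identical")
    return problems


example = json.loads(
    '{"prompt": "User history ...", "chosen": "Preferred item", "rejected": "Rejected item"}'
)
print(validate_record(example))  # []
```

Running the check over every record before launching scripts/train.sh can surface missing fields early instead of partway through a run.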

Training Entry

The default workflow is intentionally minimal:

  • implement the core optimization logic in src/ampo/trainer.py
  • define training and evaluation flow in main_ampo.py

Customization

AMPO supports custom loss extensions while keeping the standard tokenization, collation, logging, and optimization pipeline unchanged.


Citation

To be released

License

Apache License 2.0
