feature(sunjx): add rejection sampling in grm_training#38

Open
Jiaxuan-Sun wants to merge 1 commit into opendilab:main from Jiaxuan-Sun:feature/t2i-rejective-sampling-0206

Conversation

@Jiaxuan-Sun

Contributor

Rejection Sampling for GRM Training

This directory contains scripts and tools for preparing rejection-sampling training data and for training Generative Reward Models (GRMs) on text-to-image (T2I) and text-to-video (T2V) tasks.

Overview

Rejection sampling, as used here, selects high-quality training samples in four steps:

  1. Run inference on a dataset with a trained GRM
  2. Keep only the correctly predicted samples (those where the model's prediction matches the ground truth)
  3. Convert the kept samples into the training format, including the Chain-of-Thought (CoT) reasoning
  4. Train the model on these filtered samples
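
As a rough illustration of steps 1 to 3, the filtering logic could look something like the sketch below. This is not the code from this PR: `Sample`, `Prediction`, `rejection_sample`, the `predict` callable, and the output record layout are all hypothetical names chosen for the example, and a stub stands in for real GRM inference.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Sample:
    """Hypothetical dataset record: a prompt plus its ground-truth label."""
    prompt: str
    label: str


@dataclass
class Prediction:
    """Hypothetical model output: CoT reasoning text plus the final answer."""
    reasoning: str
    answer: str


def rejection_sample(
    samples: Iterable[Sample],
    predict: Callable[[Sample], Prediction],
) -> List[Dict[str, str]]:
    """Keep samples whose predicted answer matches the ground truth."""
    kept = []
    for sample in samples:
        pred = predict(sample)            # step 1: run inference
        if pred.answer != sample.label:   # step 2: reject wrong predictions
            continue
        kept.append({                     # step 3: convert to a training record
            "prompt": sample.prompt,
            "response": f"{pred.reasoning}\nAnswer: {pred.answer}",
        })
    return kept


# Stub standing in for a trained GRM (assumption): it always answers "A".
def stub_predict(sample: Sample) -> Prediction:
    return Prediction(reasoning="Image A follows the prompt more closely.", answer="A")


samples = [
    Sample(prompt="Which image better matches 'a red cube'?", label="A"),
    Sample(prompt="Which image better matches 'a blue sphere'?", label="B"),
]
kept = rejection_sample(samples, stub_predict)
```

With the stub above, only the first sample survives, because the stub's answer matches its label; the resulting records would then feed step 4 (training), which is not shown here.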

@Jiaxuan-Sun

Contributor Author

Ready to merge.
