feat: add CRPE-Relation task#1354
Merged
CRPE-Relation is a 7,576-item single-image MCQ on object/predicate/subject relationships, drawn from The All-Seeing Project V2.

Dataset: `nv-njb/CRPE`, a bundled re-host of the original `OpenGVLab/CRPE` annotations. The original ships only the 544 `abnormal_images/` JPEGs, while the remaining 5,400 records reference COCO val2017 by relative path. The re-host inlines all 1,081 unique images (537 COCO val2017 + 544 abnormal) as JPEG bytes under an `Image()` feature, so the parquet loads end-to-end via standard `load_dataset` with no extra COCO download.

Metric: `exact_match` (flexible-extract) on the MCQ letter. The filter parses the inline A./B./C./D. choices out of the question text, then tries (1) a leading uppercase letter and (2) a substring match against any choice text. It also handles common reasoning wrappers (`<think>...</think>`, `<answer>...</answer>`).
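The extraction order described above can be sketched roughly as follows. This is an illustrative approximation, not the filter shipped in this PR: the function names `parse_choices` and `extract_letter`, the exact regexes, and the rule of accepting a substring match only when exactly one choice hits are all assumptions made for the example.

```python
import re
from typing import Dict, Optional

# Hypothetical helpers; the PR's actual filter may differ in names and details.
_THINK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)
_ANSWER = re.compile(r"<answer>(.*?)</answer>", re.DOTALL | re.IGNORECASE)
# One inline choice: a letter A-D, a dot, then text up to the next choice or the end.
_CHOICE = re.compile(r"\b([A-D])\.\s*(.+?)(?=\s+[A-D]\.\s|$)", re.DOTALL)
# A leading letter such as "B", "B.", "(B)" or "B: ...", but not the article in "A man ...".
_LEADING = re.compile(r"\(?([A-D])(?=[.):,]|\s*$)")


def parse_choices(question: str) -> Dict[str, str]:
    """Pull the inline 'A. ... B. ...' choices out of the question text."""
    return {letter: text.strip() for letter, text in _CHOICE.findall(question)}


def extract_letter(response: str, question: str) -> Optional[str]:
    """Return the predicted MCQ letter, or None if nothing can be extracted."""
    # Drop reasoning wrappers first, then prefer the content of <answer>...</answer>.
    text = _THINK.sub("", response)
    wrapped = _ANSWER.search(text)
    if wrapped:
        text = wrapped.group(1)
    text = text.strip()

    # (1) Leading uppercase letter.
    leading = _LEADING.match(text)
    if leading:
        return leading.group(1)

    # (2) Substring match against the choice texts.
    # Assumption: only an unambiguous (single) hit is accepted.
    lowered = text.lower()
    hits = [k for k, v in parse_choices(question).items() if v.lower() in lowered]
    return hits[0] if len(hits) == 1 else None
```

With this sketch, `exact_match` would reduce to comparing the returned letter with the gold letter; a `None` result counts as a miss.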
kcz358 approved these changes on May 28, 2026.
Summary
Adds CRPE-Relation, a 7,576-item single-image MCQ on object / predicate / subject relationships drawn from The All-Seeing Project V2.
Dataset: `nv-njb/CRPE` — a bundled re-host of `OpenGVLab/CRPE`.
Why a re-host
The original `OpenGVLab/CRPE` repo ships the `crpe_relation.jsonl` annotation file alongside 544 `abnormal_images/` JPEGs, but the remaining 5,400 records reference COCO val2017 images by relative path — those JPEGs are not in the HF repo, so out-of-the-box `load_dataset` cannot resolve them.
The re-host inlines all 1,081 unique referenced images (537 from COCO val2017 + 544 from abnormal_images) as JPEG bytes under an `Image()` feature. Result: a self-contained parquet (~1 GB across 4 shards) that loads end-to-end via standard `load_dataset` — no extra COCO download needed.
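A minimal sketch of the inlining step, assuming each annotation record stores its relative image path under a single key. The key name `image`, the function `inline_images`, and the directory layout are hypothetical; the script actually used to build the re-host is not part of this description. The `{"bytes", "path"}` dict mirrors the layout that the `datasets` `Image()` feature stores, but this sketch uses only the standard library and does not write parquet.

```python
import json
from pathlib import Path


def inline_images(jsonl_path, image_roots, image_key="image"):
    """Yield annotation records with the image path replaced by raw bytes.

    image_roots: directories searched in order for each relative path,
    e.g. a local COCO val2017 root and the abnormal_images root.
    image_key: hypothetical name of the field holding the relative path.
    """
    cache = {}  # relative path -> bytes, so each unique image is read once
    with open(jsonl_path, encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            rel = record[image_key]
            if rel not in cache:
                for root in image_roots:
                    candidate = Path(root) / rel
                    if candidate.is_file():
                        cache[rel] = candidate.read_bytes()
                        break
                else:
                    # No root contains this image: fail loudly instead of
                    # emitting a record that cannot be loaded later.
                    raise FileNotFoundError(rel)
            record[image_key] = {"bytes": cache[rel], "path": rel}
            yield record
```

Caching by relative path is what keeps the number of file reads at the count of unique images rather than the count of records.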
Files
Parity vs. local fork
Qwen3-VL-2B-Instruct, full `test` split (7,576 items), 8x H100, greedy decoding.
Essentially identical — well within stderr.
Test plan