Problem
Synthetic queries such as:
“2 characters fight each other with guns”
can match more than one scene in the same video. A miner may return a valid fight scene that differs from the clip the validator chose as ground truth, and is then penalized unfairly.
Why It Matters
This directly affects:
- Miner fairness
- Validator scoring integrity
- Incentive alignment on Bittensor
Proposed Direction
- Enforce temporal grounding via additional constraints
- Use scene disambiguation hints (e.g., clothing, location)
- Penalize only when the semantic mismatch exceeds a threshold
- Allow multiple valid ground-truth intervals
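
The last two points above could be combined roughly as follows. This is a minimal sketch, not the validator's actual scoring code: the function names, the `(start, end)` interval representation in seconds, the use of temporal IoU as a stand-in for "semantic mismatch", and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a clip is (start_sec, end_sec).
Interval = Tuple[float, float]


def temporal_iou(a: Interval, b: Interval) -> float:
    """Intersection-over-union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def best_match_iou(pred: Interval, valid_intervals: List[Interval]) -> float:
    """Score the prediction against its best-matching valid interval,
    so any one of several correct scenes can earn full credit."""
    return max((temporal_iou(pred, gt) for gt in valid_intervals), default=0.0)


def is_penalized(pred: Interval,
                 valid_intervals: List[Interval],
                 mismatch_threshold: float = 0.5) -> bool:
    """Penalize only when the mismatch exceeds the threshold.

    Assumption: mismatch is approximated as 1 - best temporal IoU.
    A real validator might use an embedding-based similarity instead.
    """
    mismatch = 1.0 - best_match_iou(pred, valid_intervals)
    return mismatch > mismatch_threshold


# Two gunfight scenes in one video; either is an acceptable answer.
valid = [(10.0, 20.0), (95.0, 105.0)]
first_scene = is_penalized((12.0, 20.0), valid)    # overlaps scene 1
second_scene = is_penalized((96.0, 104.0), valid)  # overlaps scene 2
unrelated = is_penalized((50.0, 60.0), valid)      # overlaps neither
```

Taking the maximum over the valid intervals means a miner is never worse off for picking a different correct scene than the one the validator sampled. The open question is how the set of valid intervals is produced in the first place.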
Acceptance Criteria
- Validator scoring does not unfairly penalize semantically correct alternative scenes.
- Evaluation metrics are robust to multi-instance events, i.e. events that occur more than once in a video.