Vibe Bench
Benchmarks evaluating long-horizon tasks that require sustained human-AI interaction
Repositories
- VibeSearchBench.github.io Public
- VibeSearchBench Public
🔍 The hardest search benchmark in the wild — vague, multi-turn, proactive. 200 long-horizon tasks with persona-driven progressive disclosure, scored by verifiable schema-free knowledge-graph evaluation. No vibes, just triplet F1.
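The VibeSearchBench description names triplet F1 as its score. As a rough illustration of what such a metric can look like, here is a minimal sketch that assumes exact matching of (subject, relation, object) triplets after light text normalization. The function names `normalize` and `triplet_f1` and the matching rule are hypothetical and are not taken from the benchmark's own evaluation code, which may align triplets differently.

```python
from typing import Iterable, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)


def normalize(triplet: Triplet) -> Triplet:
    """Lowercase each part and collapse whitespace (an assumed normalization)."""
    subject, relation, obj = (" ".join(part.lower().split()) for part in triplet)
    return (subject, relation, obj)


def triplet_f1(predicted: Iterable[Triplet], gold: Iterable[Triplet]) -> float:
    """Exact-match F1 between two sets of triplets.

    Returns 0.0 when nothing matches, including when either set is empty.
    """
    pred: Set[Triplet] = {normalize(t) for t in predicted}
    ref: Set[Triplet] = {normalize(t) for t in gold}
    matched = len(pred & ref)
    if matched == 0:
        return 0.0
    precision = matched / len(pred)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
gold = [
    ("Ada Lovelace", "born in", "London"),
    ("Ada Lovelace", "worked with", "Charles Babbage"),
    ("Charles Babbage", "designed", "Analytical Engine"),
]
predicted = [
    ("ada lovelace", "born in", "London"),
    ("Ada Lovelace", "worked with", "Charles  Babbage"),
    ("Ada Lovelace", "designed", "Analytical Engine"),
]
score = triplet_f1(predicted, gold)
```

In this toy case two of the three predicted triplets match the gold set, so precision and recall are both 2/3 and the F1 score is 2/3.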