@VibeBench

Vibe Bench

Benchmarks evaluating long-horizon tasks that require sustained human-AI interaction

Pinned

  1. VibeSearchBench (Public)

    🔍 The hardest search benchmark in the wild — vague, multi-turn, proactive. 200 long-horizon tasks with persona-driven progressive disclosure, scored by verifiable schema-free knowledge-graph evalua…

    Python · 774 stars · 2 forks

  2. VibeSearchBench.github.io (Public)

    JavaScript
