Popular repositories
-
mental-health-llm-eval Public
Open evaluation harness for mental health LLM responses: 5 clinically grounded rubrics, LLM-as-judge with bias controls, and crisis-detection routing to 988 protocols.
Python
-
inspect_evals Public
Forked from UKGovernmentBEIS/inspect_evals
Collection of evals for Inspect AI
Python
-
awesome-ai-eval Public
Forked from Vvkmnn/awesome-ai-eval
☑️ A curated list of tools, methods & platforms for evaluating AI reliability in real applications

