Score model outputs against explicit rubrics with an Azure-backed judge, deterministic aggregation, and diffable JSON reports.
metrics evaluation regression-testing regression-analysis scorecards azure-openai llmops llm-evals offline-evals
Updated Apr 15, 2026 - Python