AI Eval
quality + grounding lab
New rubric
Define dimensions, weights, methods, and safety gates for an evaluation rubric.
Rubric ID (required)
Unique slug used in project references and API calls. Use kebab-case with version suffix, e.g. rag-qa-v2.0.
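The naming rule above could be checked before a rubric is saved. The following is a minimal sketch, not the application's actual validator: the function name `is_valid_rubric_id` is hypothetical, and the exact suffix shape (`-v<major>.<minor>`) is an assumption inferred from the single example `rag-qa-v2.0`.

```python
import re

# Assumed pattern: lowercase kebab-case segments followed by a
# "-v<major>.<minor>" suffix, as in the example "rag-qa-v2.0".
RUBRIC_ID_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*-v\d+\.\d+")


def is_valid_rubric_id(rubric_id: str) -> bool:
    """Return True if rubric_id is kebab-case with a version suffix."""
    return RUBRIC_ID_PATTERN.fullmatch(rubric_id) is not None


if __name__ == "__main__":
    for candidate in ("rag-qa-v2.0", "RAG_QA_v2.0", "rag-qa"):
        print(candidate, is_valid_rubric_id(candidate))
```

If the real convention allows a patch component or a bare major version, the suffix part of the pattern would need to be loosened accordingly.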
Version (required)
Semantic version string. Increment on any dimension or weight change to keep history traceable.
Name (required)
Human-readable display name shown in lists and reports.
Owner (required)
Team or person responsible for maintaining this rubric and reviewing its results.
Project
Project this rubric evaluates. A rubric can be reused across versions of the same project.
— none —
Shadow — Daily Reflection
RAG — Internal Docs QA
Area Mosa — Booking Assistant
Customer Support Reply
AI Planning Assistant
Safety gates (comma-separated)
Hard-blocker gate IDs. If any gate triggers, the run is blocked regardless of weighted score. Common: pii_leakage, false_confirmation, medical_advice_without_disclaimer.
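The gate behaviour described above (a triggered gate blocks the run no matter how high the weighted score is) can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: `parse_gates`, `evaluate_run`, the `DimensionResult` type, the 0 to 1 score scale and the `pass_mark` default are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DimensionResult:
    id: str
    weight: float  # share of the total score; weights are expected to sum to 1
    score: float   # assumed to be normalized to the range 0..1


def parse_gates(raw: str) -> list[str]:
    """Split the comma-separated gate field into clean gate IDs."""
    return [gate.strip() for gate in raw.split(",") if gate.strip()]


def evaluate_run(dimensions, rubric_gates, triggered_gates, pass_mark=0.7):
    """Combine the weighted score with hard-blocker safety gates."""
    weighted = sum(d.weight * d.score for d in dimensions)
    blocking = sorted(set(rubric_gates) & set(triggered_gates))
    if blocking:
        # A triggered gate overrides the weighted score entirely.
        return {"status": "blocked", "score": weighted, "gates": blocking}
    status = "pass" if weighted >= pass_mark else "fail"
    return {"status": status, "score": weighted, "gates": []}


if __name__ == "__main__":
    gates = parse_gates("pii_leakage, false_confirmation")
    dims = [
        DimensionResult("grounding", 0.6, 1.0),
        DimensionResult("tone", 0.4, 0.5),
    ]
    print(evaluate_run(dims, gates, triggered_gates=[]))
    print(evaluate_run(dims, gates, triggered_gates=["pii_leakage"]))
```

The score is still reported for a blocked run in this sketch, so a reviewer can see that the block came from the gate and not from low quality.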
Dimensions
Σ = 1.00 ✓
Add dimension
Columns: Name, ID, Method, Weight, Threshold
Method options: Deterministic, LLM Judge, Semantic, Claim Pipeline, Human
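The Σ = 1.00 indicator suggests that dimension weights must sum to one. A minimal sketch of such a check is shown below; the function name `weights_sum_to_one` and the tolerance (half of the last displayed decimal place) are assumptions, not documented behaviour.

```python
import math


def weights_sum_to_one(weights, tol=0.005):
    """Return True if the weights total 1.00 at two-decimal precision."""
    # math.fsum avoids accumulating floating-point rounding error.
    return math.isclose(math.fsum(weights), 1.0, abs_tol=tol)


if __name__ == "__main__":
    print(weights_sum_to_one([0.4, 0.3, 0.2, 0.1]))
    print(weights_sum_to_one([0.5, 0.3]))
```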