system online · 8 recent runs
AI Quality Control
Score, ground, and ship AI outputs with evidence.
Overall quality
81.1%
weighted across evaluators
Active projects
5
under evaluation
Recent runs
8
last 30 days
Failed cases
1
needs review
High-risk runs
2
investigate
Pipeline health
end-to-end stage status across recent runs
Input
100
5 projects
Rubric Engine
100
6 active rubrics
Scoring
82
8 runs evaluated
Claim Pipeline
87
24 claims processed
Safety Layer
88
1 finding flagged
Human Review
88
1 in queue
Reports
100
8 runs exportable
Quality breakdown
average score per evaluator · last 8 runs
Deterministic Checks
0.92/1.0
6 cases · pass 100%
LLM Judge
0.87/1.0
51 cases · pass 92%
Claim Pipeline
0.85/1.0
15 cases · pass 80%
Recent runs
latest five
Shadow — Daily Reflection
May 26, 09:14 AM · 40 cases · system-prompt-v3.1
0.8
Shadow — Daily Reflection
May 19, 10:02 AM · 40 cases · system-prompt-v3.0
regression
0.7
RAG — Internal Docs QA
May 25, 02:30 PM · 30 cases · retrieval-topk-8
0.9
RAG — Internal Docs QA
May 20, 11:15 AM · 30 cases · retrieval-topk-4
0.8
Area Mosa — Booking Assistant
May 24, 08:45 AM · 25 cases · tone-friendly-v2
0.9