Datasets
Versioned test sets. Save the questions a rubric generates as a dataset, then re-run it across model and prompt versions for apples-to-apples regression.
Datasets need Supabase. Apply
0003_datasets.sql and set the env.No datasets yet. On New run, generate questions then Save as dataset.