Run run-planner-005

case-planner-005-01

AI Planning Assistant · May 23, 04:20 PM

Case score: 0.94 / 1.0
Input
I need to launch a new SaaS product in 6 weeks. I have a designer and one backend engineer. Budget is $5,000.
Expected behavior
Produce a realistic phased launch plan. Acknowledge constraints (1 BE, 1 designer, $5k, 6w). Do not promise features that require more than the given resources. Suggest what to cut or defer.
AI output (heat map)
Six weeks and two people is enough for a focused launch — not a full product. Here's a realistic breakdown:

**Week 1-2: Foundation**
- Backend: Auth + one core data model. No extras.
- Designer: Brand + landing page only. No app UI yet.
- Goal: something live at a URL.

**Week 3-4: Core loop**
- Backend: The single most important user action (whatever gets a user to their 'aha' moment).
- Designer: That action's UI. One screen done well.
- Goal: one person can sign up and do the thing.

**Week 5: Polish + QA**
- Both: squash launch blockers. Error states, loading states, email confirmation.
- Do not add features.

**Week 6: Launch**
- Deploy to production. Write the launch post. Launch to a small list (100 people).
- Budget note: At $5k, allocate ~$1k for infra (Vercel + DB), ~$1k for one paid channel test, keep $3k in reserve for week-6 surprises.

Defer: mobile app, admin dashboard, payment integration, anything requiring a third engineer.
supported · partial · unsupported · contradicted
Retrieved context
  • [1] Planning principles: 'Scope to team size. 2-person teams ship 1 feature per week if unblocked.'
  • [2] Budget allocation template: 'Infra 15-20%, marketing test 20%, reserve 40-60%.'

Claims (3)

supported
Six weeks and two people is enough for a focused launch — not a full product
Planning principles confirm 2-person team velocity.
0.97
supported
allocate ~$1k for infra (Vercel + DB), ~$1k for one paid channel test, keep $3k in reserve
Budget template: 20% infra = $1k, 20% marketing = $1k, 60% reserve = $3k on $5k budget.
0.88
supported
2-person teams ship 1 feature per week if unblocked
Direct quote from planning principles.
0.99
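
The arithmetic behind the second claim can be checked directly. The following is a minimal Python sketch, not part of the evaluation pipeline: the `check_allocation` helper and the category keys are hypothetical names, and the ranges are copied from retrieved context [2], with the single 20% marketing figure treated as a range whose bounds are equal.

```python
# Template ranges as fractions of the total budget, from retrieved context [2].
# The dictionary keys are illustrative names, not identifiers from the tool.
TEMPLATE = {
    "infra": (0.15, 0.20),
    "marketing_test": (0.20, 0.20),
    "reserve": (0.40, 0.60),
}


def check_allocation(total, allocation, template=TEMPLATE, tol=1e-9):
    """Return {category: (share, within_range)} for each allocated amount."""
    report = {}
    for name, amount in allocation.items():
        share = amount / total
        low, high = template[name]
        # A small tolerance guards against floating-point noise at the bounds.
        report[name] = (share, low - tol <= share <= high + tol)
    return report


total = 5_000
allocation = {"infra": 1_000, "marketing_test": 1_000, "reserve": 3_000}

report = check_allocation(total, allocation)
fully_allocated = sum(allocation.values()) == total

for name, (share, ok) in report.items():
    print(f"{name}: {share:.0%} of budget, within template: {ok}")
print("allocations sum to total:", fully_allocated)
```

Each amount sits at the upper bound of its template range (20%, 20%, 60%), and the three amounts sum to the full $5,000.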

Dimension scores

Task completion · LLM Judge · 0.97 0.75

Full 6-week plan delivered with clear phases and deliverables.

Plan coherence · LLM Judge · 0.94 0.70

Phases build on each other. No week assumes prior week failed.

Hallucination risk · Claim Pipeline · 0.96 0.85

All concrete numbers grounded in retrieved context.

Accuracy · LLM Judge · 0.93 0.75

Budget math correct. Resource constraints respected.

Actionability · LLM Judge · 0.97 0.70

Weekly deliverables are specific and testable.

Completeness · LLM Judge · 0.91 0.70

Covers all 6 weeks, budget, team split, deferred scope.

Tone fit · LLM Judge · 0.88 0.60

Direct and confident. Appropriate for planning context.

Consistency · LLM Judge · 0.95 0.70

No contradictions. Defer list consistent with constraints.
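
The page does not state how the case score is derived from the dimension scores. As an illustration only, the Python sketch below assumes an unweighted mean and assumes that the second number on each dimension line is a pass threshold. Both are assumptions, not documented behavior of the tool.

```python
from statistics import mean

# (dimension, score, second number) copied from the dimension lines above.
# Reading the second number as a pass threshold is an assumption.
DIMENSIONS = [
    ("Task completion", 0.97, 0.75),
    ("Plan coherence", 0.94, 0.70),
    ("Hallucination risk", 0.96, 0.85),
    ("Accuracy", 0.93, 0.75),
    ("Actionability", 0.97, 0.70),
    ("Completeness", 0.91, 0.70),
    ("Tone fit", 0.88, 0.60),
    ("Consistency", 0.95, 0.70),
]

# Assumed aggregation: unweighted mean, rounded to two decimals.
unweighted = mean(score for _, score, _ in DIMENSIONS)
case_score = round(unweighted, 2)

# Dimensions whose score falls below the assumed threshold.
below = [name for name, score, bar in DIMENSIONS if score < bar]

print(f"unweighted mean: {unweighted:.5f}")
print(f"rounded case score: {case_score}")
print("below assumed threshold:", below)
```

Under these assumptions the mean is 0.93875, which rounds to the 0.94 case score reported for this case, and no dimension falls below its second number. The match is consistent with an unweighted mean but does not rule out a weighted scheme.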