OmniTutor
▸ build roadmap · 10 phases · M1 / M2 / M3

Build it in ten steps. Then productize. Then scale.

10 atomic phases, each shippable, each tested. Land all 10 and you have M1: a real tutor that always feels fast. Then comes M2 (productization) and M3 (scale & differentiation).

Control census · what we owe the user

Every interactive surface across the runtime + modals + discovery. None decorative. By P10, every one functional.

| Region | Count | What |
|---|---|---|
| Top bar | 16 | Brand · Lesson ▾ · Save & exit · 4 mode segments · Level ▾ · 3 verbosity · Make me ▾ + 6 items · ⚙ |
| Lesson dropdown | 7 | + Start new · 6 row types (now / same chapter / recent) |
| Beat strip | 4 | Beat-number nav · Play · Pause · Stop |
| Whiteboard toolbar | 11 | Pen · Highlight · Erase · Text · 5 colors · Undo · Clear · Fullscreen · 3 board tabs |
| Ask area | 5 | Textarea · Ask AI · 🎙 · 📎 · 📷 |
| I-want-to chips | 6 | Mode-aware · Explain · Example · Visualize · Animate · Quiz/Stuck/Skip/Look · Got it/Advance/Done |
| Coach board | 6 | Answer input · Submit · Hint · Show me · Why · Skip step |
| Test board | 3 | Options A–D · Submit · Skip |
| Adjust modal | 17 | X · 5-step direction · 9 axes · textarea · 🎙 · Cancel · Apply |
| Setup modal | 22 | X · 16 segments · textarea · advanced toggle + advanced inputs · skip · Plan |
| Plan modal | 13 | Back · audio play · 7 iterate chips · textarea · 🎙 · Send/Lock |
| Right rail | 7 | Mini owl · 2 tabs · plan-tree node toggles · going-further cards · refresh · status |
| Bottom actions | 2 | Adjust level · Start new lesson |
| Discovery | ~30 | Search · 🎙 · 4 lanes of tiles · recents · paths · sign-in · brand · pricing |
| Total unique controls | ~150 | All real by P10. None stubbed. |
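
The census total can be checked mechanically rather than by hand. The sketch below is one way to do that in TypeScript. The names `RegionCount`, `CENSUS`, and `totalControls` are hypothetical, the counts are copied from the census, and Discovery's "~30" is treated as exactly 30 (an assumption).

```typescript
// Hypothetical manifest: region names and counts copied from the census above.
// Discovery is listed as "~30"; it is treated as exactly 30 here (assumption).
interface RegionCount {
  region: string;
  count: number;
}

const CENSUS: RegionCount[] = [
  { region: "Top bar", count: 16 },
  { region: "Lesson dropdown", count: 7 },
  { region: "Beat strip", count: 4 },
  { region: "Whiteboard toolbar", count: 11 },
  { region: "Ask area", count: 5 },
  { region: "I-want-to chips", count: 6 },
  { region: "Coach board", count: 6 },
  { region: "Test board", count: 3 },
  { region: "Adjust modal", count: 17 },
  { region: "Setup modal", count: 22 },
  { region: "Plan modal", count: 13 },
  { region: "Right rail", count: 7 },
  { region: "Bottom actions", count: 2 },
  { region: "Discovery", count: 30 },
];

// Sum the per-region counts.
function totalControls(rows: RegionCount[]): number {
  return rows.reduce((sum, row) => sum + row.count, 0);
}

console.log(totalControls(CENSUS)); // 149, which the census rounds to "~150"
```

A manifest like this could also seed the per-phase control lists that the automated sweeps iterate over.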

The ten phases

Each phase is atomic — testable on its own, ships a real working slice. Roughly 1 week each.

P1 · Foundation (heavy)
▸ scope: Repo skeleton · auth gate · design tokens (four-blue palette) · owl SVG component · KaTeX · deploy pipeline · logging · API key vault. No user-facing UI yet.
▸ design ref: design_language.html tokens · owl_brand.html SVG variants
▸ test: Healthcheck endpoint live · owl SVG renders on a test page · hello-world Haiku call returns text
P2 · Discovery + 4 subject pages (light)
▸ scope: omnitutor.ai static landing · 4 lanes · search bar (text input only) · 4 subject pages (Physics · Math · History · English), all clickable, all navigate
▸ design ref: discovery.html + 4 subject_*.html
▸ test: Click every tile → right subject page · click any item inside subject → routes to /setup?topic=X
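
The `/setup?topic=X` route in the P2 test implies that topic names with spaces or punctuation must be encoded before they reach the URL. A minimal sketch, assuming a hypothetical helper named `setupRoute` (not part of any existing codebase):

```typescript
// Hypothetical helper: builds the setup route for a clicked subject item.
// encodeURIComponent escapes spaces, "&", "?", and other reserved characters
// so the topic survives as a single query parameter.
function setupRoute(topic: string): string {
  return `/setup?topic=${encodeURIComponent(topic.trim())}`;
}

console.log(setupRoute("Projectile motion")); // /setup?topic=Projectile%20motion
console.log(setupRoute("Supply & demand"));   // /setup?topic=Supply%20%26%20demand
```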
P5 · Streaming pipeline · backend (heavy)
▸ scope: Beat 1 from Haiku · Beats 2–7 streamed from Sonnet in parallel · partial-render protocol · cache write-through · failure modes (retry · fallback · friendly degraded)
▸ design ref: streaming_architecture.html
▸ test: Latency budget (<3s to Beat 1) · streamed beats hydrate plan tree as they arrive · cache hit on second request <200ms
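
The P5 scope can be sketched as a small orchestration function. This is an illustration under stated assumptions, not the design in streaming_architecture.html. All names (`Beat`, `BeatGenerator`, `PipelineDeps`, `streamLesson`) are hypothetical, the two model calls are injected as plain functions, Beats 2–7 are assumed to start at the same time as Beat 1, and "cache write-through" is read loosely as "store the lesson as soon as it completes cleanly".

```typescript
// All names below are hypothetical; model calls are injected so the sketch
// runs without any real API client.
interface Beat {
  index: number;
  text: string;
  degraded: boolean; // true when retries were exhausted and a fallback was used
}

type BeatGenerator = (topic: string, index: number) => Promise<string>;

interface PipelineDeps {
  fastModel: BeatGenerator;     // stands in for the Haiku call (Beat 1)
  deepModel: BeatGenerator;     // stands in for the Sonnet call (Beats 2-7)
  cache: Map<string, Beat[]>;   // stand-in for the real cache layer
  onBeat: (beat: Beat) => void; // partial-render hook, called per beat
}

// Try the generator up to retries + 1 times, then fall back to a friendly
// degraded beat instead of rejecting.
async function generateWithRetry(
  gen: BeatGenerator, topic: string, index: number, retries: number
): Promise<Beat> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return { index, text: await gen(topic, index), degraded: false };
    } catch {
      // swallow the error and try again
    }
  }
  return { index, text: "This part is taking a moment. Ask me to retry.", degraded: true };
}

async function streamLesson(
  topic: string, deps: PipelineDeps, totalBeats = 7
): Promise<Beat[]> {
  const cached = deps.cache.get(topic);
  if (cached) {
    cached.forEach((beat) => deps.onBeat(beat)); // hot path: replay from cache
    return cached;
  }
  const emit = (beat: Beat): Beat => {
    deps.onBeat(beat); // render each beat as soon as it arrives
    return beat;
  };
  const pending: Promise<Beat>[] = [
    generateWithRetry(deps.fastModel, topic, 1, 1).then(emit),
  ];
  for (let i = 2; i <= totalBeats; i++) {
    pending.push(generateWithRetry(deps.deepModel, topic, i, 1).then(emit));
  }
  const beats = await Promise.all(pending); // result order follows beat index
  if (beats.every((b) => !b.degraded)) {
    deps.cache.set(topic, beats); // never cache a degraded lesson
  }
  return beats;
}
```

Because `generateWithRetry` never rejects, one failing beat cannot take down the rest of the lesson; it only marks that beat as degraded and keeps the lesson out of the cache.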
P6 · Active runtime shell + Teach modes (heavy)
▸ scope: Topbar · right rail · ask area · beat strip · playback widget · Teach-Concept board · Teach-Derivation board with verbosity toggle · ALL top-bar controls functional
▸ design ref: lesson_runtime_v12.html#active Teach scenarios
▸ test: Beat strip navigates · playback plays/pauses TTS · verbosity toggle changes density · all topbar items real
P7 · Coach + Test modes (medium)
▸ scope: Coach: step input · submit · hint · show me · why · skip, all evaluating real answers · prominent step-counter header. Test: MCQ · timer · submit · skip · score
▸ design ref: Coach + Test scenarios in runtime
▸ test: Wrong answers get hints · "Show me" walks through · timer ticks · grading works · all 9 controls real
P8 · Board mode + Visualize/Animate (medium)
▸ scope: Whiteboard with all 11 tools · canvas drawing · Tutor watches & comments · Visualize chip generates SVG diagram on demand · Animate chip generates simulation · drama covers generation latency
▸ design ref: Whiteboard scenario + animated chip concept
▸ test: Draw → Tutor responds · Visualize produces relevant diagram · Animate produces usable mini-sim
P9 · Adjust level + cross-cutting controls (heavy)
▸ scope: Adjust modal fully functional · Level picker · Make me dropdown (all 6 outputs really generate) · Save & exit · Start new lesson · settings · status pulse · plan-tree expand · going-further refresh · iWant chips on every mode
▸ design ref: Adjust modal + topbar + right rail
▸ test: Playwright sweep on ~50 cross-cutting controls · "Make me flashcards" actually produces flashcards · adjust really changes level
P10 · Audio · lip-sync · cache · polish (heavy)
▸ scope: ElevenLabs TTS streaming · owl mouth lip-syncs to amplitude · caching layer on (Canvas-A pattern · trending pre-warmed) · mobile-responsive sweep · accessibility (keyboard · ARIA) · final latency audit
▸ design ref: owl_brand.html lip-sync variants + streaming spec rules
▸ test: Cold-start ≤3s confirmed · cached <200ms · mobile passes · keyboard-only flow works · zero spinners under WebPageTest
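
"Lip-syncs to amplitude" in the P10 scope can be illustrated with a small mapping from audio samples to a mouth-openness value. This is a sketch under assumptions: the function names, the noise floor, the ceiling, and the smoothing factor are all hypothetical tuning choices, not values from owl_brand.html. In a browser, the samples could plausibly come from the Web Audio `AnalyserNode.getFloatTimeDomainData` call.

```typescript
// Root-mean-square level of a frame of samples in the range [-1, 1].
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Hypothetical tuning values (assumptions, to be adjusted by ear and eye).
const NOISE_FLOOR = 0.02; // below this level the mouth stays shut
const CEILING = 0.5;      // at or above this level the mouth is fully open
const SMOOTHING = 0.6;    // share of the previous frame kept, to avoid jitter

// Returns mouth openness in [0, 1] for the current frame.
// `previous` is the value returned for the preceding frame.
function mouthOpenness(samples: Float32Array, previous: number): number {
  const level = rms(samples);
  const normalized = Math.min(
    1,
    Math.max(0, (level - NOISE_FLOOR) / (CEILING - NOISE_FLOOR))
  );
  return SMOOTHING * previous + (1 - SMOOTHING) * normalized;
}
```

Calling `mouthOpenness` once per animation frame and feeding the result into the SVG mouth height gives amplitude-driven movement. It is not viseme-accurate, which matches the placeholder-SVG scope of M1.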

Three guardrails

Apply to every phase. No relaxation.

▸ no stubs
No phase ships if any control on its surface is non-functional. "Skip step" stub is a fail. We complete what we touch.
▸ no spinners
No phase ships with a spinner. If a thing is slow, drama covers it. Latency budget is part of acceptance.
▸ tested
Each phase ends with an automated Playwright sweep hitting every control on its surface plus a manual checklist signed off.
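
The three guardrails can be expressed as a single acceptance gate over a sweep report. The sketch below is an assumption-heavy illustration: the `PhaseReport` shape and the `phaseGateFailures` function are hypothetical, a real Playwright sweep would have to produce the report, and the latency numbers are borrowed from the P5 and P10 tests (applying them to every phase is an assumption).

```typescript
// Hypothetical report shape produced by a phase's automated sweep.
interface ControlResult {
  id: string;          // e.g. "coach.skip-step"
  exercised: boolean;  // the sweep interacted with the control
  functional: boolean; // the interaction had its real effect (not a stub)
}

interface PhaseReport {
  controls: ControlResult[];
  spinnersSeen: number;
  coldStartMs: number;
  cachedStartMs: number;
  manualChecklistSigned: boolean;
}

const COLD_START_BUDGET_MS = 3000; // "≤3s" from the P10 test
const CACHED_BUDGET_MS = 200;      // "<200ms" from the P5 and P10 tests

// Returns the reasons a phase may not ship; an empty array means it passes.
function phaseGateFailures(report: PhaseReport): string[] {
  const failures: string[] = [];
  for (const control of report.controls) {
    if (!control.exercised) failures.push(`not tested: ${control.id}`);
    else if (!control.functional) failures.push(`stub: ${control.id}`);
  }
  if (report.spinnersSeen > 0) {
    failures.push(`spinners seen: ${report.spinnersSeen}`);
  }
  if (report.coldStartMs > COLD_START_BUDGET_MS) {
    failures.push(`cold start over budget: ${report.coldStartMs}ms`);
  }
  if (report.cachedStartMs >= CACHED_BUDGET_MS) {
    failures.push(`cached start over budget: ${report.cachedStartMs}ms`);
  }
  if (!report.manualChecklistSigned) {
    failures.push("manual checklist not signed off");
  }
  return failures;
}
```

Returning a list of reasons rather than a boolean keeps the gate useful in CI output: a failed phase says exactly which control is a stub or which budget was missed.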

The three milestones

M1 makes everything else worth doing. M2 opens the gates. M3 makes it scale & sing.

▸ M1 · ~10 weeks · the demo

It works · and never feels slow

All 10 phases shipped. Real tutor, real models, real audio. Demo-ready to a private cohort.

▸ in M1

  • All ~150 controls functional · zero stubs
  • Both invocation paths work (free-form text + landing tile)
  • Cold start ≤3s to Beat 1 · zero spinners · drama covers latency
  • Hot start (cached) <200ms
  • Audio TTS streams · owl lip-syncs to amplitude
  • 5 modes work: Teach-Concept · Teach-Derivation · Coach · Test · Whiteboard
  • Adjust level · Make me · Visualize/Animate chips · all real
  • Mobile-responsive · keyboard-accessible · ARIA basics
  • 4 subject pages curated (Physics · Math · History · English)
  • Anything else generated on-demand

▸ NOT in M1 (deferred)

  • Accounts (Google sign-in · profiles · persistent library)
  • Mobile native apps (web-PWA only)
  • Real Rive owl with viseme lip-sync (placeholder SVG)
  • Multi-user · classroom · teacher dashboard
  • Payments · subscriptions
  • Social (sharing · friend activity · peer paths)
  • Spaced repetition · long-term memory · teach-back
  • Full hands-free voice conversation mode
  • Offline mode
  • Pre-warmed cache at scale (catalog not pre-generated)
  • Subjects beyond the 4 (others on-demand only)
verdict: ~70% of "OmniTutor the company" done. Real, fast, shippable to a small private cohort. Can show investors, demo to teachers, validate the experience.
▸ M2 · ~6 weeks · productize

Open the gates

Make M1 safe to release publicly. Add the systems that turn a demo into a product.

▸ in M2

  • Accounts · Google + email sign-in · profiles · persistent library
  • Personal data layer · session history · progress tracking
  • Payments · Stripe · free tier + paid plans · subscription mgmt
  • Content safety · moderation · age-appropriate filtering
  • Observability · cost dashboard · latency dashboard · error tracking
  • Pre-warmed cache for top 100 trending lessons
  • Email · transactional + onboarding sequences
  • Privacy & legal · ToS · privacy policy · GDPR · COPPA-aware
  • Better mobile-web (responsive sweep beyond M1's basics)
  • Public landing page (marketing) separate from app

▸ NOT in M2

  • Native iOS / Android apps
  • Real Rive owl (still placeholder)
  • Classroom / teacher tools
  • Social features
  • Catalog beyond on-demand
  • Full voice-mode conversation
verdict: Public-launchable. Real users · real billing · real data. ~85% of company done.
▸ M3 · ~10 weeks · scale & sing

What makes it different

The features that make OmniTutor un-replicable. Native apps. The full owl. Classroom. Memory.

▸ in M3

  • Real Rive owl with full viseme lip-sync · animated illustration · multiple expressions
  • Native iOS & Android apps (SwiftUI + Compose) sharing the streaming engine
  • Full hands-free voice conversation mode
  • Spaced repetition · long-term memory · "what did we learn last week?"
  • Curated catalog scaled to ~500 subjects · pre-generated polished lessons
  • Classroom · teacher dashboard · multi-student management · assignments
  • Social · share lessons · peer paths · friend activity feed
  • Offline mode (PWA + native)
  • API for partners (textbook publishers · schools · LMS integration)
  • Internationalization (UI in 10+ languages)

▸ Beyond M3

  • Hardware (custom learning device?)
  • VR / AR experiences
  • Direct school B2B sales motion
  • Paid creator economy on top of OmniTutor
verdict: Defensible · differentiated · scalable. The product Mukesh has been describing, fully formed, replicable only by re-living the design journey.