OmniTutor
▸ P1 LLD foundation · scaffolding the new repo

P1 · foundation

P0 cut us cleanly away from Canvas A and stood up the empty stack. P1 fills it with the foundation libs that every later phase consumes — but still no user-facing tutor functionality. Once P1 lands, P2 can light up the discovery + subject pages on top of a working harness. Contracts come from schema; the 5-section LLD template comes from p0_lld.html.
▸ phase: P1 · foundation
▸ status: draft · pending sign-off
▸ depends on: P0 signed off
▸ unblocks: P2 (discovery + subjects)

1. Scope

By end of P1, the new omnitutor repo has the foundation libs that every later phase consumes: backend scaffolding (FastAPI app · DB pool · session middleware · log handler · model client · cost tracker · rate-limit middleware · moderation middleware), frontend scaffolding (design tokens · owl SVG component · KaTeX integration · base CSS · base JS state), CI/CD (GitHub Actions deploy · pytest + Playwright + k6 harnesses), and operational scaffolding (SSM secrets · CloudWatch logs · pg_dump backup cron · cost-cap cron). No user-facing surfaces exist yet — that's P2.

Acceptance demo: GET https://omnitutor.ai/v1/healthz returns {ok:true}. POST to https://omnitutor.ai/v1/hello returns a Haiku-generated greeting in <3s · audit shows the call logged in model_runs with cost · trace_id · latency.

2. Before / after

▸ end of P0

  • Empty omnitutor repo · just schema_v0.sql + README
  • Postgres tables exist · all empty
  • omnitutor.service serves only /healthz
  • S3 bucket exists · empty
  • No frontend · just /_design/*.html static review hub
  • No tests · no harness
  • No model calls wired

▸ end of P1

  • FastAPI app with full middleware stack (session · log · rate · moderate)
  • Anthropic + ElevenLabs clients wired · /v1/hello works
  • Every model call writes to model_runs automatically
  • Frontend serves tokens.css · owl SVG component · KaTeX
  • pytest + Playwright + k6 harnesses run green on CI
  • GitHub Actions deploys on push to main
  • pg_dump cron + cost-cap cron live · CloudWatch logs flowing
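The "every model call writes to model_runs automatically" guarantee could be met with a wrapper along the following lines. This is a minimal sketch, not the planned implementation: the in-memory MODEL_RUNS list stands in for the model_runs table, and the model and error fields are assumptions (only trace_id, cost_usd, latency_ms and status appear in this document).

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelRun:
    """Stand-in for one model_runs row; `model` and `error` are assumed columns."""
    trace_id: str
    model: str
    status: str = "pending"
    latency_ms: int = 0
    cost_usd: float = 0.0
    error: Optional[str] = None


MODEL_RUNS: List[ModelRun] = []  # hypothetical in-memory substitute for the table


@contextmanager
def track_run(model: str, trace_id: Optional[str] = None):
    """Record exactly one row per model call, whether it succeeds or fails."""
    run = ModelRun(trace_id=trace_id or uuid.uuid4().hex, model=model)
    started = time.perf_counter()
    try:
        yield run                 # caller fills in run.cost_usd from usage data
        run.status = "ok"
    except Exception as exc:
        run.status = "error"
        run.error = type(exc).__name__
        raise                     # the endpoint can map this to an ErrorEnvelope
    finally:
        run.latency_ms = int((time.perf_counter() - started) * 1000)
        MODEL_RUNS.append(run)
```

A caller would wrap each SDK call in `with track_run("haiku", trace_id) as run:` and set `run.cost_usd` before leaving the block. Writing the row in `finally` means failed calls are auditable too.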

3. Work plan

Twelve ordered steps. Each is verified before moving to the next.

▸ ordered work · backend first, then frontend, then ops

  1. FastAPI skeleton: app/main.py with /v1/healthz · /v1/session · /v1/event · /v1/hello (test endpoint). Pydantic models for User · Session · ErrorEnvelope.
  2. DB pool + migrations: asyncpg connection pool. Alembic for migrations · seed with schema_v0.sql. Helper module db.py with get_user(anon_id), create_session(user_id), etc.
  3. Session middleware: reads ot_session HttpOnly cookie. If absent or expired, creates anonymous user + sets cookie. Attaches request.state.session. Renews on every request.
  4. Logging middleware: structlog with the 13-field dictionary from schema §9. Generates trace_id per request. Logs request start + end. CloudWatch sink.
  5. Model client: app/models/anthropic.py wrapping the SDK. Wraps every call in track_run() that writes a model_runs row · catches errors · maps to ErrorEnvelope. Same for ElevenLabs.
  6. Rate-limit middleware: reads rate_buckets · enforces §8 limits per scope+bucket. Returns over_quota when exceeded · sets retry_after_ms.
  7. Moderation middleware: wraps any endpoint that sends user text to a model. Calls Haiku safety classifier on input. Returns refused if blocked + logs safety.refused event.
  8. Frontend tokens: web/styles/tokens.css exporting all design-language vars (sky-100..900, sea, ink, gold, coral, mint, cream, type stack). Imported once in base layout.
  9. Owl SVG component: web/components/owl.html · drop-in <svg> partial with all viseme/state classes pre-wired. Same component used in modal, rail, favicon.
  10. KaTeX integration: CDN-loaded · auto-render .ttx spans on DOMContentLoaded. Single shared init script.
  11. CI/CD: .github/workflows/deploy.yml · on push to main: pytest → playwright headless → SSH to devbox → git pull + systemctl restart omnitutor. Pulls SSM secrets into /etc/omnitutor/env.
  12. Ops crons: cron/backup.sh runs a nightly pg_dump → S3. cron/cost_cap.sh computes spend nightly · sets OT_OVER_QUOTA if ≥$50/day. cron/cache_prewarm.sh is a stub (real prewarm in P10).

4. Test plan

▸ acceptance tests

  1. Healthcheck: curl https://omnitutor.ai/v1/healthz → 200 with {ok:true,ts}
  2. Anonymous session: first POST /v1/session creates users + sessions rows · sets ot_session cookie · subsequent requests reuse it
  3. Hello-world Haiku: POST /v1/hello with {topic:"physics"} returns Haiku-generated greeting in <3s · model_runs row written with cost_usd + latency_ms + trace_id
  4. Logging fields: Tail journalctl -u omnitutor · every request's logs share one trace_id · all 13 dictionary fields populated
  5. Rate limit: 6 anonymous /v1/hello calls from one IP in an hour · 6th returns over_quota
  6. Moderation: obvious harmful prompt to /v1/hello returns refused · safety.refused event logged
  7. Cost cap: echo "OT_OVER_QUOTA=1" >> /etc/omnitutor/env · restart · next /v1/hello returns over_quota immediately, no model call
  8. Frontend tokens: a test page using tokens.css renders with the four-blue palette · KaTeX renders F=ma · owl SVG appears with bow tie + glasses
  9. CI deploy roundtrip: push a no-op commit to main · GitHub Actions run completes green · service restarts · healthcheck still 200 within 90s
  10. Backup: trigger cron/backup.sh manually · verify s3://omnitutor-assets/backups/<date>.sql.gz exists · restore into scratch DB · all 11 tables present
  11. SSE skeleton: GET /v1/lesson/test/stream with Last-Event-ID echo · returns SSE-shaped events with monotonic IDs (just plumbing, no real beats yet)
  12. Playwright sweep: spin up the test page · assert tokens load · assert owl visible · assert KaTeX rendered. CI runs this on every push.
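For acceptance test 11, the SSE plumbing could look roughly like the sketch below. It takes one plausible reading of "Last-Event-ID echo": the client sends the last ID it saw and the stream resumes after it. The event name "beat" and the function names are assumptions, and data is assumed to be single-line (multi-line payloads need one data: field per line).

```python
from typing import Iterator, List, Optional


def sse_event(event_id: int, data: str, event: str = "beat") -> str:
    """Frame one Server-Sent Event; the trailing blank line ends the event."""
    return f"id: {event_id}\nevent: {event}\ndata: {data}\n\n"


def stream(beats: List[str], last_event_id: Optional[str] = None) -> Iterator[str]:
    """Yield events with IDs 1..n, skipping those the client already received."""
    seen = int(last_event_id) if last_event_id and last_event_id.isdigit() else 0
    for event_id, data in enumerate(beats, start=1):
        if event_id > seen:
            yield sse_event(event_id, data)
```

Deriving IDs from position in the stream keeps them monotonic across reconnects, which is the property the test asserts.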

▸ test endpoint & smoke fixture

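# pytest fixture · in-process ASGI client (assumes pytest-asyncio + httpx;
# the ephemeral Postgres must be provisioned separately, e.g. by a CI service)
import httpx
import pytest
import pytest_asyncio

from app import db          # assumed import path for the db.py helper module
from app.main import app


@pytest_asyncio.fixture
async def client():
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://test") as c:
        yield c


@pytest.mark.asyncio
async def test_hello_world(client):
    r = await client.post("/v1/hello", json={"topic": "hello"})
    assert r.status_code == 200
    body = r.json()
    assert body["ok"] is True
    assert "greeting" in body
    assert body["latency_ms"] < 3000
    # DB-side: verify the model_runs row was written for this trace
    async with db.acquire() as conn:
        run = await conn.fetchrow(
            "SELECT * FROM model_runs WHERE trace_id=$1", body["trace_id"]
        )
    assert run is not None
    assert run["cost_usd"] > 0
    assert run["status"] == "ok"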
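The cost-cap path (acceptance test 7) can be checked without a live model using two small pure functions, sketched here. The $50/day threshold and the OT_OVER_QUOTA flag come from the work plan; the function names and the envelope shape are hypothetical.

```python
from typing import Iterable, Mapping, Optional

DAILY_CAP_USD = 50.0  # work plan step 12: trip at >= $50/day


def over_daily_cap(cost_rows: Iterable[float], cap: float = DAILY_CAP_USD) -> bool:
    """True when the day's summed model_runs.cost_usd values reach the cap."""
    return sum(cost_rows) >= cap


def quota_guard(env: Mapping[str, str]) -> Optional[dict]:
    """Return an over_quota envelope if the flag is set, else None (proceed)."""
    if env.get("OT_OVER_QUOTA") == "1":
        return {"ok": False, "error": "over_quota"}   # assumed envelope shape
    return None
```

An endpoint would call `quota_guard(os.environ)` before touching the model client, so a tripped cap costs nothing further.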

5. Acceptance gate

P1 ships when every line below is true. P2 cannot start until then.

▸ P1 done · sign-off list

  1. All 12 work-plan steps (§3) green · code in main branch
  2. All 12 acceptance tests (§4) pass · CI green
  3. Hello-world Haiku call returns in <3s end-to-end · model_runs auditable
  4. Cost-cap and rate-limit verified by deliberate trip
  5. Moderation classifier verified on a known-bad prompt
  6. Backup restore drilled · <60s recovery
  7. Mukesh signs off · git tag v1.1-p1-shipped