Problem
Planning-style grids fail in three layers at once: the pure calculation, the HTTP contract, and what the browser renders. If each layer is tested in isolation with different numbers, regressions slip through as “the UI looks fine.”
The goal is a faux release train: one canonical grand total per scenario, verified in Vitest on the engine, again via JSON Schema on the API response, and again in Chromium E2E against the DOM—without floating-point surprises on money-shaped totals.
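To make the floating-point concern concrete, here is a minimal sketch that sums money-shaped strings using BigInt minor units. It is a dependency-free stand-in, not the engine itself (the Approach section names decimal.js). The helper names `toMinorUnits`, `fromMinorUnits`, and `sumDecimalStrings`, the fixed scale of 2, and the non-negative-only input rule are all assumptions made for illustration.

```typescript
// Hypothetical helpers: fixed scale of 2 decimal places, non-negative inputs only.
const SCALE = 2;

// Parse "12.5" or "12.50" into an exact integer count of minor units (1250).
function toMinorUnits(value: string): bigint {
  if (!/^\d+(\.\d{1,2})?$/.test(value)) {
    throw new Error(`not a 2-place decimal string: ${value}`);
  }
  const [whole, fraction = ""] = value.split(".");
  return BigInt(whole + fraction.padEnd(SCALE, "0"));
}

// Render minor units back to a string with exactly SCALE decimal places.
function fromMinorUnits(units: bigint): string {
  const digits = units.toString().padStart(SCALE + 1, "0");
  return `${digits.slice(0, -SCALE)}.${digits.slice(-SCALE)}`;
}

function sumDecimalStrings(values: string[]): string {
  const total = values.reduce((acc, v) => acc + toMinorUnits(v), BigInt(0));
  return fromMinorUnits(total);
}

const floatTotal = String(0.1 + 0.2);                   // "0.30000000000000004"
const exactTotal = sumDecimalStrings(["0.10", "0.20"]); // "0.30"
```

The point is that the exact total stays a string from fixture to assertion, so every layer can compare it with plain string equality.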
Approach
- Three golden workspace fixtures—baseline, fx-shift, rounding-stress—with expectedGrandTotal as string decimals and shared Zod validation across engine, API, and tests.
- Pure evaluateWorkspace(fixture) in packages/engine: no I/O in core; decimal.js (or equivalent) with documented rounding mode and scale.
- POST /v1/evaluate on Fastify with Zod I/O, committed openapi.yaml, and structured errors; X-Request-Id on every request for traceability.
- Minimal React (Vite) grid: fixture select, debounced recalc, AbortController to cancel stale requests.
- Playwright @smoke: expect.poll on data-testid="grand-total"; string equality only—no Number() coercion on totals.
- GitHub Actions: unit → contract → E2E; Playwright HTML report uploaded on failure.
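The stale-request cancellation from the grid bullet can be sketched as a small "latest wins" helper around AbortController. This is an illustrative sketch, not the repository's code: the name `createLatestOnly` is hypothetical, and the debounce timer is left out to keep the example short.

```typescript
// Hypothetical helper: hands out one AbortSignal per recalculation and
// aborts the previous one, so only the newest request can still resolve.
function createLatestOnly(): () => AbortSignal {
  let current: AbortController | null = null;
  return () => {
    current?.abort(); // cancel the in-flight request, if any
    current = new AbortController();
    return current.signal;
  };
}

const nextSignal = createLatestOnly();
const first = nextSignal();  // e.g. recalc requested for "baseline"
const second = nextSignal(); // user switched to "fx-shift" before it returned
// first.aborted is now true; second.aborted is false
```

In the grid, the debounced handler would call `nextSignal()` immediately before each request and pass the result as the `signal` option to `fetch`, so a slow response for an old fixture cannot overwrite the newer total.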
Layered flow
packages/fixtures/*.json (baseline | fx-shift | rounding-stress)
→ packages/engine evaluateWorkspace (Vitest)
→ packages/api POST /v1/evaluate (supertest + Ajv / OpenAPI)
→ packages/web React grid (Playwright @smoke)
→ CI: pnpm test → contract → test:e2e (artifact on fail)
Definition of done
- Identical grandTotal strings in engine tests, HTTP contract tests, and E2E for all three fixtures.
- docs/release-smoke.md documents the ~3-minute @smoke demo path; README includes a short interview script.
- Mutation discipline: a documented drill deliberately changes the response shape and confirms that the contract tests fail against the committed openapi.yaml.
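The kind of drift the contract layer is meant to catch can be shown with a hand-rolled strict shape check. This is only a stand-in for validating against the OpenAPI schema with Ajv; the response fields `fixtureId` and `grandTotal`, the function name `shapeDrift`, and the sample values are assumptions for illustration.

```typescript
// Hypothetical response shape; the real contract lives in openapi.yaml.
const EXPECTED_KEYS = ["fixtureId", "grandTotal"] as const;

// Returns a list of problems; an empty list means the body matches the shape.
function shapeDrift(body: unknown): string[] {
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return ["body is not an object"];
  }
  const record = body as Record<string, unknown>;
  const problems: string[] = [];
  for (const key of EXPECTED_KEYS) {
    // Totals must stay strings; a numeric grandTotal counts as drift.
    if (typeof record[key] !== "string") problems.push(`${key}: expected string`);
  }
  for (const key of Object.keys(record)) {
    if (!(EXPECTED_KEYS as readonly string[]).includes(key)) {
      problems.push(`${key}: unexpected property`);
    }
  }
  return problems;
}

const ok = shapeDrift({ fixtureId: "baseline", grandTotal: "1234.50" });
const drifted = shapeDrift({ fixtureId: "baseline", grandTotal: 1234.5, total: "x" });
// ok is []; drifted reports the numeric grandTotal and the extra "total" key
```

A drill in this spirit would apply a mutation like the `drifted` body to the handler and check that the contract step goes red before anything reaches E2E.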
Scope & safety
No authentication in this portfolio slice—focus is correctness and observability. Public API inputs are validated with Zod; fixture ids are allowlisted.
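As a rough sketch of the allowlist idea, the three fixture ids named earlier can be held in one constant and checked with a type guard. The names `FIXTURE_IDS`, `isFixtureId`, and `parseFixtureId` are hypothetical, and the guard is hand-rolled so the example has no dependencies; with Zod the same allowlist would typically be expressed with `z.enum`.

```typescript
// The three golden fixtures named in the approach; anything else is rejected.
const FIXTURE_IDS = ["baseline", "fx-shift", "rounding-stress"] as const;
type FixtureId = (typeof FIXTURE_IDS)[number];

function isFixtureId(value: unknown): value is FixtureId {
  return typeof value === "string" && (FIXTURE_IDS as readonly string[]).includes(value);
}

// Throws on unknown ids so a bad request never reaches the fixture lookup.
function parseFixtureId(value: unknown): FixtureId {
  if (!isFixtureId(value)) {
    throw new Error(`unknown fixture id: ${JSON.stringify(value)}`);
  }
  return value;
}

const accepted = parseFixtureId("fx-shift");      // "fx-shift"
const rejected = isFixtureId("../../etc/passwd"); // false
```

Because the id is only ever compared against a fixed list, it is never interpolated into a file path or query.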
Request IDs propagate from the browser to the server logs, so a failing request can be traced end to end the way a support ticket would be. The demo fixtures contain no PII.
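One way to keep an id stable from browser to logs is to reuse a well-formed inbound `X-Request-Id` and mint a new one otherwise. The sketch below assumes lowercased header keys (as Node's HTTP layer provides) and an id pattern chosen purely for illustration; `resolveRequestId` and `REQUEST_ID_PATTERN` are hypothetical names, and the generator is injected so the example stays dependency-free.

```typescript
// Assumed format: 8 to 64 characters of letters, digits, dot, underscore, hyphen.
const REQUEST_ID_PATTERN = /^[A-Za-z0-9._-]{8,64}$/;

// Reuse a well-formed inbound id; otherwise fall back to a freshly generated one.
function resolveRequestId(
  headers: Record<string, string | string[] | undefined>,
  generate: () => string,
): string {
  const raw = headers["x-request-id"];
  const candidate = Array.isArray(raw) ? raw[0] : raw;
  return candidate !== undefined && REQUEST_ID_PATTERN.test(candidate)
    ? candidate
    : generate();
}

const reused = resolveRequestId({ "x-request-id": "web-7f3a9c21" }, () => "generated-id");
const minted = resolveRequestId({}, () => "generated-id");
// reused is "web-7f3a9c21"; minted is "generated-id"
```

In a real server, `generate` could be `randomUUID` from `node:crypto`. Rejecting malformed ids keeps arbitrary client text out of the log lines.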
What I'd extend next
- Richer measure DAG (dependent accounts) once the golden pipeline is boringly green.
- Testcontainers or a real Postgres slice if the API grows beyond in-memory fixture maps.
- Broader E2E matrix only after @smoke is stable; avoid flake before expanding surface area.
Repository
Monorepo layout: packages/engine, packages/api, packages/web, packages/fixtures, with the Playwright config and GitHub Actions workflows at the repository root.
Link TBD when the public repo is published.