A 1980s software engineering discipline, reborn as the missing scaffold for LLM-driven coding agents. Every left-hand specification is the definition of a right-hand test — written before the agent writes code.
The default agentic loop is: prompt → code → "looks good" → ship. There is no spec to drift from and no test to fail. The bug surfaces three weeks later in production.
auth.py now import redis?
"Build me a URL shortener."
→ 400 lines of code.
→ It runs. Maybe.
→ No test names a requirement.
Spec → test stubs → arch → integration stubs → modules → unit tests → code → climb back up the V.
Horizontal dashed lines = each spec defines the test on its right.
A specification you can't write a test for isn't a specification. It's a vibe.
LLM agents are excellent at producing artifacts that satisfy local constraints (this function compiles) and poor at preserving global intent (does the system do what was asked?).
The V-Model forces every level of intent to be written as a falsifiable artifact before code generation. The artifact on the left literally becomes the test fixture on the right.
Tests stop being an afterthought and become the contract that the agent's code must satisfy. The agent is no longer guessing what "done" means.
prd.md with Gherkin-style user stories.
Agent also generates tests/acceptance/*.feature as empty stubs.
"Given a long URL, when I POST /shorten, then I receive a 7-char slug."
docs/arch.md: services, data flow, tech choices, ADRs. Agent generates tests/system/*.spec stubs that hit the public API surface.
tests/integration/* stubs that exercise modules together with fakes at the seams.
tests/unit/* stubs from the types.
# Phase 1: Requirements
add docs/prd.md
add tests/acceptance/shorten.feature      # stub

# Phase 2: System Design
add docs/arch.md
add docs/adr/001-storage.md
add tests/system/api.spec.ts              # stub

# Phase 3: Architectural Design
add src/api/, src/store/, src/hash/       # empty
add tests/integration/api_store.test.ts   # stub

# Phase 4: Module Design
add src/hash/index.ts                     # signatures only
add tests/unit/hash.test.ts               # stub

# NO IMPLEMENTATION YET.
Each commit is reviewable on its own. The full intent of the system exists as text and test stubs before a single function body is written.
| Test fails | You return to |
|---|---|
| Unit | Module Design (Phase 4) |
| Integration | Architectural Design (Phase 3) |
| System | System Design (Phase 2) |
| Acceptance | Requirements (Phase 1) |
This is the V-Model's quiet superpower: a failure tells you which design document was wrong, not just which line of code. For agents, that's a far more useful error signal than a stack trace.
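The routing in the table above is mechanical enough to automate. The following is a minimal sketch, assuming the test level can be read from the tests/{acceptance,system,integration,unit}/ directory layout; the names `levelFromPath` and `returnTo` are hypothetical helpers, not part of any tool.

```typescript
type TestLevel = "unit" | "integration" | "system" | "acceptance";

interface DesignPhase {
  phase: number;
  name: string;
}

// Mirrors the table: the level of the failing test names the phase to revisit.
const RETURN_TO: Record<TestLevel, DesignPhase> = {
  unit: { phase: 4, name: "Module Design" },
  integration: { phase: 3, name: "Architectural Design" },
  system: { phase: 2, name: "System Design" },
  acceptance: { phase: 1, name: "Requirements" },
};

// Assumption: the level is the directory directly under tests/.
function levelFromPath(testPath: string): TestLevel | undefined {
  const match = /(?:^|\/)tests\/(unit|integration|system|acceptance)\//.exec(testPath);
  return match ? (match[1] as TestLevel) : undefined;
}

// Turns a failing test path into the design document to reopen.
function returnTo(testPath: string): string {
  const level = levelFromPath(testPath);
  if (level === undefined) {
    return "unknown test level";
  }
  const target = RETURN_TO[level];
  return `${target.name} (Phase ${target.phase})`;
}
```

An agent harness could prepend this string to the failure report, so the next turn starts from the design document and not from the stack trace.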
Each step is one agent turn with one reviewable artifact. No code is written before step 5.
"Authenticated users shorten URLs. Slugs are 7 chars. /s/<slug> 302-redirects."
Single Node service, Postgres, base62 hash of incrementing ID. ADR for why not nanoid.
api/ · store/ · hash/ · auth/. Interfaces typed. Store is a port; Postgres is one adapter.
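As an illustration of "Store is a port", here is one possible shape for the port with an in-memory adapter of the kind an integration test could use. The method names `insert` and `find` are assumptions; the source names the port but not its methods.

```typescript
// Hypothetical port: the rest of the system depends only on this interface.
interface Store {
  // Saves a long URL and returns its incrementing ID.
  insert(longUrl: string): Promise<number>;
  // Returns the long URL for an ID, or undefined if the ID is unknown.
  find(id: number): Promise<string | undefined>;
}

// In-memory adapter: same contract, no database.
class InMemoryStore implements Store {
  private readonly urls = new Map<number, string>();
  private nextId = 1;

  async insert(longUrl: string): Promise<number> {
    const id = this.nextId++;
    this.urls.set(id, longUrl);
    return id;
  }

  async find(id: number): Promise<string | undefined> {
    return this.urls.get(id);
  }
}
```

A Postgres adapter would implement the same interface, so tests and production code differ only in which adapter is constructed.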
encode(id:number):string · decode(slug:string):number|Err. Errors are types, not strings.
Bottom of the V. Agent writes implementations bounded by signatures & unit tests.
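A sketch of what an implementation bounded by the `encode` and `decode` signatures could look like. The alphabet order, the left-padding with "0" to reach 7 characters, the variants of `Err`, and the choice to throw on an out-of-range ID are all assumptions made for illustration.

```typescript
// Assumed alphabet order: digits, then uppercase, then lowercase.
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
const BASE = ALPHABET.length; // 62
const SLUG_LENGTH = 7;
const MAX_ID = Math.pow(BASE, SLUG_LENGTH) - 1; // largest ID that fits in 7 chars

// Errors as a discriminated union, not strings (variants are hypothetical).
type Err =
  | { kind: "invalid-length"; actual: number }
  | { kind: "invalid-character"; char: string };

function encode(id: number): string {
  // Assumption: the signature returns string only, so bad input throws.
  if (!Number.isInteger(id) || id < 0 || id > MAX_ID) {
    throw new RangeError(`id must be an integer in [0, ${MAX_ID}]`);
  }
  let n = id;
  let slug = "";
  do {
    slug = ALPHABET.charAt(n % BASE) + slug;
    n = Math.floor(n / BASE);
  } while (n > 0);
  // Left-pad with the zero digit so every slug is exactly 7 characters.
  while (slug.length < SLUG_LENGTH) {
    slug = ALPHABET.charAt(0) + slug;
  }
  return slug;
}

function decode(slug: string): number | Err {
  if (slug.length !== SLUG_LENGTH) {
    return { kind: "invalid-length", actual: slug.length };
  }
  let id = 0;
  for (let i = 0; i < slug.length; i++) {
    const char = slug.charAt(i);
    const digit = ALPHABET.indexOf(char);
    if (digit === -1) {
      return { kind: "invalid-character", char };
    }
    id = id * BASE + digit;
  }
  return id;
}
```

Under these assumptions `encode(62)` returns `"0000010"`, and `decode` inverts `encode` for every ID in range. Callers narrow the result with `typeof result === "number"` before using it.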
11 unit tests run on every save. Mutation testing runs nightly. Coverage is a side effect, not a goal.
API↔Store↔Hash wired together with an in-memory store. Catches the bug where two concurrent requests produce colliding slugs.
Real Postgres in Docker. Real HTTP. JWT auth flow. Rate-limit headers correct.
The 4 .feature files from step 01 run end-to-end. If they're green, the requirement is met. Ship.
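What "green" means for one requirement can be written as a predicate before any implementation exists. The sketch below covers the "then I receive a 7-char slug" clause from the earlier user story; the `ShortenResponse` shape and the alphanumeric alphabet are assumptions, since the story fixes only the length.

```typescript
// Hypothetical shape of the POST /shorten response body.
interface ShortenResponse {
  slug: string;
}

// The "then" clause as a check: exactly 7 characters, all alphanumeric.
function satisfiesSlugRequirement(response: ShortenResponse): boolean {
  return /^[0-9A-Za-z]{7}$/.test(response.slug);
}
```

A step definition for the .feature file could call this predicate on the real HTTP response, which keeps the acceptance criterion in one place.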
Working software + 4 design docs + 18 tests, each traceable to a specific line of intent.
docs/ or tests/. No artifact = no progress.
// @spec prd.md#FR-2.1.
docs/, tests/{acceptance,system,integration,unit}/, src/, adr/.

# /prompts/phase-04-module-design.md
ROLE
You are operating in PHASE 4 of the V-Model: Module Design.
You may not write function bodies.

INPUTS
- docs/prd.md
- docs/arch.md
- src/**/*.d.ts (existing signatures)

DELIVERABLES
1. Add or refine TypeScript signatures in src/
2. Define error types as discriminated unions
3. For each new signature, emit a unit-test stub in tests/unit/ tagged // @spec <file>#<anchor>

CONSTRAINTS
- No function bodies. throw new Error("todo") only.
- Every exported symbol must be tested.
- Every test stub must reference a spec anchor.

DONE WHEN
- pnpm typecheck passes
- pnpm test --reporter=list shows N todo tests
- git diff has no implementation lines
# end
Writing all four design docs before any code. The V is a per-feature loop, not a project phase. Slice vertically — one user story end-to-end through the V at a time.
Code drifts; docs don't follow. Mitigation: spec anchors are referenced from tests; broken anchor = CI failure.
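The "broken anchor = CI failure" check could be sketched as follows. It assumes tags of the form `// @spec <file>#<anchor>` and takes the set of anchors defined in each spec file as input; how those anchors are collected from the docs is left out, and the function names are hypothetical.

```typescript
interface SpecRef {
  file: string;
  anchor: string;
}

// Extracts every `// @spec <file>#<anchor>` tag from test source text.
function extractSpecRefs(testSource: string): SpecRef[] {
  const pattern = /\/\/\s*@spec\s+([^\s#]+)#([\w.-]+)/g;
  const refs: SpecRef[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(testSource)) !== null) {
    refs.push({ file: match[1], anchor: match[2] });
  }
  return refs;
}

// Returns the refs that point at a missing file or a missing anchor.
function findBrokenRefs(
  testSource: string,
  anchorsByFile: Map<string, Set<string>>,
): SpecRef[] {
  return extractSpecRefs(testSource).filter((ref) => {
    const anchors = anchorsByFile.get(ref.file);
    return anchors === undefined || !anchors.has(ref.anchor);
  });
}
```

A CI step would run this over every test file and exit non-zero when the returned list is non-empty.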
The agent writes tests that pass trivially. Defend with mutation testing (Stryker, mutmut) and the rule: every assertion must reference a spec anchor.
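To see why mutation testing exposes trivially passing tests, consider this toy model. It is not how Stryker or mutmut work internally: the mutant is written by hand and a test is modelled as a function that returns true when it passes.

```typescript
type SlugCheck = (slug: string) => boolean;

// Implementation under test: a slug is valid when it has exactly 7 characters.
const original: SlugCheck = (slug) => slug.length === 7;

// Hand-written mutant: the kind of small change a mutation tool might make.
const mutant: SlugCheck = (slug) => slug.length >= 7;

// A test takes an implementation and reports whether it passes.
type SlugTest = (impl: SlugCheck) => boolean;

// Only checks the happy path, so it cannot tell the two apart.
const weakTest: SlugTest = (impl) => impl("abc1234");

// Also checks that an 8-character slug is rejected.
const strongTest: SlugTest = (impl) => impl("abc1234") && !impl("abc12345");

// A mutant is killed when the test passes on the original and fails on the mutant.
function kills(test: SlugTest, impl: SlugCheck, mutated: SlugCheck): boolean {
  return test(impl) && !test(mutated);
}
```

Here `kills(weakTest, original, mutant)` is false, so the mutant survives and the weak test is flagged, while `kills(strongTest, original, mutant)` is true.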
Agent slips implementation into a design session. Defend with phase-locked prompts (slide 08) and a pre-commit hook that rejects function bodies in src/**/*.ts during design-phase commits.
The V suits well-scoped features and safety-critical work. For exploratory prototyping, run a stripped two-rung V (Requirements ↔ Acceptance only) and graduate to the full V when the idea survives.
| Term | Meaning |
|---|---|
| V&V | Verification & Validation. "Built it right" vs. "built the right thing." |
| ADR | Architecture Decision Record. One file per non-obvious choice, with context + consequence. |
| Gherkin | Given/When/Then DSL — turns a user story into an executable spec. |
| Port / Adapter | Interface that lets you swap an implementation (real DB ↔ in-memory fake). |
| Mutation testing | Mutates your code; tests should fail. Catches assertions that don't assert. |
| Spec anchor | Stable id (heading or comment) that a test cites — e.g. prd.md#FR-2.1. |