🔨 Agent 3

Application Builder

The Builder is a continuous execution engine. It reads TASK-GRAPH.md, implements tasks one by one, validates output against contracts, tracks change impact, and never voluntarily pauses until every task is DONE or a genuine blocker is hit.

ℹ️
The Builder never stops voluntarily except for:

  • sudo/root access required
  • blocked for 5+ consecutive attempts
  • required secret is missing
  • task is marked STALE and needs human re-approval
  • SHIP declaration issued

Task Status Values

| Status | Meaning |
| --- | --- |
| PENDING | Not yet started; dependencies not satisfied |
| IN-PROGRESS | Currently being implemented |
| DONE | Implemented, tested, and gate commands passed |
| BLOCKED | Cannot proceed; reason documented; awaiting human input |
| STALE | Previously DONE but invalidated by a spec change (!change). Must be re-derived before re-execution. |

⚠️
STALE tasks require human review

When the Builder encounters a STALE task, it presents the original implementation and the change context, then waits for human confirmation before re-implementing. A STALE task is never automatically re-executed.

Per-Task Execution Cycle

  1. Read task and dependencies
     Confirm all declared dependencies are DONE. Read the test contract and acceptance criteria before writing any code.
  2. Query Pattern Library
     Search PATTERN_INDEX.md for the task's domain and stack keywords. Apply any matched patterns before writing new code.
  3. Implement & write tests
     Write tests first when TC entries exist. Implement feature code. Never mark DONE without passing test contracts.
  4. Validate against contracts (step 10b)
     If the task touches an API endpoint or data model, run Zod schema validation and verify OpenAPI stub conformance. A contract failure blocks DONE marking.
  5. Run gate commands
     pnpm agent:check (or stack equivalent) must exit zero. UI tasks additionally require pnpm agent:visual.
  6. Change Impact Check (step 13b)
     Cross-reference modified files against FILE_OWNERSHIP_MAP.md. If any other task owns a modified file, re-run that task's targeted tests.
  7. Mark DONE & emit status line
     Update TASK-GRAPH.md status, write status line to CHECKPOINT.md. Proceed to the next PENDING task.
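
The dependency rule in step 1 can be illustrated with a minimal sketch. It assumes tasks have already been parsed out of TASK-GRAPH.md into simple records; the `Task` class, its field names, and `next_runnable` are hypothetical names chosen for illustration, not the Builder's actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Task:
    # Hypothetical in-memory form of one TASK-GRAPH.md entry.
    task_id: str
    status: str = "PENDING"
    depends_on: list[str] = field(default_factory=list)


def next_runnable(tasks: dict[str, Task]) -> Task | None:
    """Return the first PENDING task whose declared dependencies are all DONE."""
    for task in tasks.values():
        if task.status != "PENDING":
            continue
        if all(tasks[dep].status == "DONE" for dep in task.depends_on):
            return task
    return None  # nothing runnable: everything is DONE, BLOCKED, STALE, or waiting


tasks = {
    "TASK-001": Task("TASK-001", status="DONE"),
    "TASK-002": Task("TASK-002", depends_on=["TASK-001"]),
    "TASK-003": Task("TASK-003", depends_on=["TASK-002"]),
}
```

With this sample graph, TASK-002 is selected and TASK-003 stays untouched until TASK-002 reaches DONE.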

Change Impact Tracking

After every task, the Builder performs a Reverse CIA (Change Impact Analysis):

  1. Collect every file modified in this task
  2. Look each file up in FILE_OWNERSHIP_MAP.md
  3. Find all other tasks that own those same files
  4. Re-run the targeted test commands for each affected task
  5. If any prior test fails, the affected task is marked STALE and the Builder halts to report the regression
ℹ️
Why Change Impact Tracking?

A task that passes its own tests can still break a previously DONE task that shares a database model, utility function, or API client. Without this check, regressions accumulate silently and surface only at Verifier time, when they're expensive to fix.
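
The five Reverse CIA steps listed above can be sketched roughly as follows. This assumes FILE_OWNERSHIP_MAP.md has been parsed into a mapping from file path to owning task IDs, and that a test runner is supplied as a callable; both function names and the mapping shape are assumptions for illustration.

```python
from typing import Callable


def affected_tasks(modified_files: list[str],
                   ownership: dict[str, set[str]],
                   current_task: str) -> list[str]:
    """Steps 1-3: other tasks that own any file modified by current_task."""
    owners: set[str] = set()
    for path in modified_files:
        owners |= ownership.get(path, set())
    owners.discard(current_task)  # the current task is not its own regression target
    return sorted(owners)


def find_regressions(tasks: list[str],
                     run_targeted_tests: Callable[[str], bool]) -> list[str]:
    """Steps 4-5: re-run each affected task's tests; return the tasks that fail."""
    return [task for task in tasks if not run_targeted_tests(task)]
```

Any task returned by `find_regressions` would be the one marked STALE before the Builder halts to report.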

Status Line Format

After each task, the Builder emits a structured status line to CHECKPOINT.md:

--- Task 5: User Authentication → DONE ✅
    TC: 3/3 pass (1 unit, 2 integration)
    Contracts: ✅ Zod schema valid | OpenAPI: ✅ conformant
    Regression: ✅ no cross-task failures
    Gates: ✅ lint  ✅ types  ✅ tests  ✅ build
    Patterns applied: PATTERN-001 (NextAuth v5 + Prisma sessions)
    Notes: Used database sessions per PATTERN-001; JWT mode disabled.

Checkpoint Cadence

Every 3–5 completed tasks, the Builder writes a full checkpoint to CHECKPOINT.md:

  • Tasks completed since last checkpoint
  • Tasks remaining (PENDING count)
  • BLOCKED tasks and their reasons
  • STALE tasks and their change context
  • Patterns applied in this batch
  • Cross-task regression status

DECISIONS.md

When the Builder makes a non-obvious implementation choice that differs from the plan or represents a trade-off, it logs an entry to DECISIONS.md:

## DECISION-007: Used optimistic locking instead of row-level locks
Date: 2025-06-01
Task: TASK-018
Context: Row-level locks caused deadlocks under concurrent booking requests in load testing.
Decision: Switch to optimistic locking with retry (max 3 attempts).
Trade-off: Slightly higher retry overhead vs. elimination of deadlock risk.
Pattern: Candidate for PATTERN-007.

Builder Rules (Key)

| Rule | Description |
| --- | --- |
| Rule 1 | Never mark a task DONE without all TC tests passing |
| Rule 2 | Never skip pnpm agent:check with bypass flags |
| Rule 3 | Never write code for a task whose dependencies are not DONE |
| Rule 4 | Never auto-resolve a STALE task; wait for human re-approval |
| Rule 5 | Check Pattern Library before implementing any auth, database, or payment integration |
| Rule 6 | Contract validation (Zod + OpenAPI) is mandatory for all API-touching tasks |
| Rule 7 | Change Impact Check runs after every task, not only after suspected regressions |
| Rule 8 | Log every non-obvious decision to DECISIONS.md |

Pipeline Script Integration

Every file write by the Builder passes through a 5-step pipeline powered by dedicated scripts. These steps run automatically; no manual invocation is required.

Step 1: Pre-write Secret Scan

Before any file is written, secret_scan.py scans the proposed content against 14 regex patterns.

python scripts/secret_scan.py <file>

Patterns include OpenAI keys, Anthropic keys, AWS access keys, GitHub tokens, Stripe keys, Twilio tokens, and generic high-entropy strings. If a match is found, the write is hard-blocked and repair context is returned to the Builder for remediation before retrying.
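
A pre-write scan of this kind can be sketched as a line-by-line regex match. The two patterns below are illustrative stand-ins based on commonly published token shapes; they are not the 14 patterns that secret_scan.py actually uses, and the function name and return shape are assumptions.

```python
import re

# Illustrative patterns only (assumed shapes, not the script's real pattern set).
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def scan(content: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every match in the proposed content."""
    findings = []
    for lineno, line in enumerate(content.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


def write_allowed(content: str) -> bool:
    """Hard-block rule: any finding means the write must not proceed."""
    return not scan(content)
```

Returning line numbers alongside pattern names is one way to give the Builder the repair context it needs to remove the secret and retry.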

Step 2: Write Gate (Staged Write)

After the secret scan passes, the proposed write is staged for human review with a colored terminal diff.

python scripts/staged_write.py stage --task TASK-NNN --file path/to/file --content /tmp/proposed.txt

The human selects one of three options:

  • Apply: write proceeds; write_log.py record is called immediately after
  • Edit: human modifies the proposed content; revised version is applied
  • Reject: write is refused; exit 1 is returned with repair context; Builder re-enters the implementation step for this task
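
As a rough idea of how a colored terminal diff for the review step might be produced, the sketch below uses Python's standard `difflib`. How staged_write.py actually renders its diff is not described here; the function names and the ANSI color choices are assumptions.

```python
import difflib

GREEN, RED, RESET = "\033[32m", "\033[31m", "\033[0m"


def render_diff(path: str, current: str, proposed: str) -> str:
    """Build a unified diff between the file on disk and the proposed content."""
    diff = difflib.unified_diff(
        current.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


def colorize(diff_text: str) -> str:
    """Color added lines green and removed lines red; leave headers and context plain."""
    out = []
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            out.append(f"{GREEN}{line}{RESET}")
        elif line.startswith("-") and not line.startswith("---"):
            out.append(f"{RED}{line}{RESET}")
        else:
            out.append(line)
    return "\n".join(out)
```

Calling `print(colorize(render_diff("app.py", old_text, new_text)))` would show the reviewer what Apply would change before anything touches disk.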

Step 3: Post-write Hot Compilation

Immediately after each approved write, the Builder runs a fast compile check before proceeding to tests.

| Stack | Command |
| --- | --- |
| Go | go build ./... |
| TypeScript / Node.js | npx tsc --noEmit |
| Rust | cargo check |
| Python | python -m py_compile <file> |

If compilation fails, the Builder re-enters the implementation step without proceeding to the test or SAST steps. The hot check catches type errors and syntax mistakes at the lowest possible cost.
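
One way to picture the hot check is a lookup from stack to command, followed by an exit-code test. The commands come from the table above; the stack keys (`"go"`, `"typescript"`, and so on) and both function names are hypothetical.

```python
import subprocess

# Commands as listed in the table; the dictionary keys are assumed identifiers.
HOT_CHECKS = {
    "go": ["go", "build", "./..."],
    "typescript": ["npx", "tsc", "--noEmit"],
    "rust": ["cargo", "check"],
    "python": ["python", "-m", "py_compile"],  # target file is appended per write
}


def hot_check_command(stack: str, file_path: str) -> list[str]:
    """Return the compile-check command for a stack; only Python takes a file argument."""
    cmd = list(HOT_CHECKS[stack])
    if stack == "python":
        cmd.append(file_path)
    return cmd


def run_hot_check(stack: str, file_path: str) -> bool:
    """Run the check; False means the Builder re-enters the implementation step."""
    result = subprocess.run(hot_check_command(stack, file_path),
                            capture_output=True, text=True)
    return result.returncode == 0
```

Only the Python check is scoped to a single file; the other three commands check the whole package or project, so they can also surface breakage in files the write did not touch.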

Step 4: Post-write SAST Scan

After the hot compilation check passes, a non-blocking SAST scan runs in the background.

python scripts/sast_scan.py <file> --task TASK-NNN

The scan always exits 0 and never blocks the Builder. Findings are appended to SAST_FINDINGS.jsonl. The Security agent reads this file during its 20-domain review, giving it a complete history of static analysis findings across all task writes.

Step 5: Post-write Write Log

Every approved write is recorded to WRITE_LOG.jsonl for later consumption by commit_by_task.py.

python scripts/write_log.py record --task TASK-NNN --file path/to/file.py

The write log enables commit_by_task.py to group all writes by task and create one conventional git commit per task. It also provides the checkpoint references needed by pipeline_replay.py for !replay operations.
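
The grouping that commit_by_task.py relies on can be sketched as below. The record shape is an assumption: each WRITE_LOG.jsonl line is taken to be a JSON object with `task` and `file` keys, mirroring the `--task` and `--file` flags of the record command. The real log format may differ.

```python
import json
from collections import defaultdict
from typing import Iterable


def group_writes_by_task(jsonl_lines: Iterable[str]) -> dict[str, list[str]]:
    """Group written file paths by task ID, preserving first-seen order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        files = grouped[record["task"]]
        if record["file"] not in files:  # a file rewritten twice is committed once
            files.append(record["file"])
    return dict(grouped)
```

Each key of the result would then correspond to one commit, with its list of paths as the files to stage.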

Parallel Task Execution

The Builder supports two parallel execution methods for tasks that have no write-file overlap.

Method A: Go Goroutines

For independent build and test commands, parallel_runner.go runs tasks simultaneously using Go goroutines.

go run scripts/parallel_runner.go -tasks TASK-001,TASK-002 -cmd "pnpm test" -task-graph artifacts/build/TASK-GRAPH.md

Before executing, the runner reads FILE_OWNERSHIP_MAP.md and refuses any batch where two tasks write to the same file. Per-task output is captured and reported on completion. Exit 0 means all tasks passed; exit 1 means at least one failed.

Method B: Parallel Agent Tool Calls

For independent file writes across non-overlapping task groups, Agent 3 issues multiple parallel tool calls in a single response. This leverages Claude Code's native parallel execution rather than an external process. Method B is preferred for write-heavy tasks; Method A is preferred for test-heavy tasks.

ℹ️
Write-conflict check is mandatory before any parallel batch

Before starting either method, the Builder cross-references the output_files fields of all candidate tasks against FILE_OWNERSHIP_MAP.md. Tasks that share output files are never run in parallel; they are sequenced instead.
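
The write-conflict check amounts to a pairwise set intersection over the candidate batch. The sketch below assumes each task's output_files have already been loaded into a set; the function name and the input shape are illustrative.

```python
from itertools import combinations


def find_write_conflicts(
    output_files: dict[str, set[str]],
) -> list[tuple[str, str, list[str]]]:
    """Return (task_a, task_b, shared_files) for every pair that writes a common file."""
    conflicts = []
    for a, b in combinations(sorted(output_files), 2):
        shared = output_files[a] & output_files[b]
        if shared:
            conflicts.append((a, b, sorted(shared)))
    return conflicts


batch = {
    "TASK-001": {"src/a.ts"},
    "TASK-002": {"src/b.ts"},
    "TASK-003": {"src/a.ts", "src/c.ts"},
}
```

For this batch the only conflict is between TASK-001 and TASK-003 on src/a.ts, so those two would be sequenced while TASK-002 remains free to run alongside either.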

Coverage Enforcement

After all TC test contracts pass for a task, check_coverage.py runs to verify the 70% coverage threshold.

python scripts/check_coverage.py --threshold 70 --task TASK-NNN

If coverage is below threshold, the task cannot be marked DONE. The Builder adds missing test cases until coverage meets or exceeds 70%, then re-runs the check. The threshold applies per-task, not globally: a single poorly-covered task blocks its own DONE status without affecting other tasks.

| Stack | Coverage backend |
| --- | --- |
| Go | go test -cover ./... |
| Node.js / TypeScript | nyc or c8 |
| Python | pytest-cov |
| Rust | cargo-tarpaulin |
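
The per-task gate itself reduces to a small predicate once a backend has reported covered and total line counts. The sketch below is an assumption about the logic, not the contents of check_coverage.py; in particular, treating a task with no measurable lines as fully covered is an illustrative choice.

```python
def coverage_percent(covered: int, total: int) -> float:
    """Line coverage as a percentage; zero measurable lines counts as 100 (assumption)."""
    return 100.0 * covered / total if total else 100.0


def can_mark_done(tc_passed: bool, covered: int, total: int,
                  threshold: float = 70.0) -> bool:
    """DONE requires passing test contracts and coverage at or above the threshold."""
    return tc_passed and coverage_percent(covered, total) >= threshold
```

The comparison is inclusive, matching the "meets or exceeds 70%" wording: 70 covered lines out of 100 passes, 69 does not.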

Coverage is also checked by the Verifier in Phase 3A as a project-wide gate. A task that barely meets 70% locally may still contribute to a project-wide shortfall caught at verification time.