Architecture Overview
CodeSleuth AI is built around four core concepts: the RARV cycle, versioned handoffs, a structured artifact directory, and a compounding knowledge base. Understanding these makes every agent's behavior predictable.
The RARV Cycle
Every agent follows the same four-step micro-loop before producing any output.
No agent produces output that skips the Reflect or Verify step. This prevents silent failures and half-complete tasks.
Complexity Tiers
Every project is automatically assigned a complexity tier at Discovery. The tier governs how many security domains are reviewed, how many test contracts are required, and which performance budget thresholds apply.
| Tier | Criteria | Security Domains | TC Requirement |
|---|---|---|---|
| Simple | Single platform, no auth, no payments, no external APIs | 5 (LOW only) | 1 TC per task minimum |
| Standard | Auth, database, 1–2 platforms, standard API integrations | HIGH+ domains | 2 TC per task minimum |
| Complex | Multi-platform, payments, multi-tenant, ML/AI components | All 20 domains | Full test contract suite |
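The Criteria column can be read as a small decision function. The sketch below is illustrative only: `ProjectTraits`, its field names, and `assign_tier` are hypothetical, and it assumes that any single Complex trait is enough to reach the Complex tier, that "multi-platform" means more than two platforms, and that a database alone moves a project out of Simple. The real Discovery agent may weigh these differently.

```python
from dataclasses import dataclass


@dataclass
class ProjectTraits:
    # All field names are hypothetical; they mirror the Criteria column.
    platform_count: int = 1
    has_auth: bool = False
    has_database: bool = False
    has_payments: bool = False
    is_multi_tenant: bool = False
    has_ml_components: bool = False
    external_api_count: int = 0


def assign_tier(traits: ProjectTraits) -> str:
    """Return "Simple", "Standard", or "Complex" for a project."""
    # Assumption: one Complex trait is sufficient, and "multi-platform"
    # means more than two platforms (Standard covers 1-2).
    if (
        traits.platform_count > 2
        or traits.has_payments
        or traits.is_multi_tenant
        or traits.has_ml_components
    ):
        return "Complex"
    # Assumption: anything beyond the Simple criteria lands in Standard.
    if (
        traits.platform_count > 1
        or traits.has_auth
        or traits.has_database
        or traits.external_api_count > 0
    ):
        return "Standard"
    return "Simple"
```

Checking the Complex traits first means a project with both auth and payments is never under-classified as Standard.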
Two-Phase Planning
Agent 2 (Technical Planning) produces all implementation artifacts in a single pass triggered by !plan.
| Output | Human Action | Purpose |
|---|---|---|
| TDD, INTERFACES, SCHEMA, TASK-GRAPH, contracts/ | !build | Full implementation blueprint with test contracts, Zod schemas, and OpenAPI stubs. |
Phase A catches scope disagreements early, when they're cheap to fix. Without it, the Planner could spend thousands of tokens on full contracts before the human realizes the task count is wrong or a platform was missed.
Spec Change Protocol
Mid-build scope changes go through a formal review before any implementation changes. This prevents silent drift between the spec and the codebase.
1. Human issues !change [description]. Describes what needs to change: a new feature, a removed feature, or a design pivot.
2. Orchestrator produces a Blast Radius Report. Lists every affected task, file, contract, and test, and estimates scope as Small / Medium / Large.
3. Human approves with !change-approve or cancels with !change-cancel. No implementation changes happen until the change is approved.
4. Affected tasks are marked STALE and CHANGE_LOG.md is updated. The Builder detects STALE tasks and re-plans them before executing. The changelog is a permanent record.
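The approval gate in this protocol behaves like a small state machine. The sketch below is a hypothetical in-memory model, not the Orchestrator's actual code: `ChangeRequest`, `Pipeline`, the `PENDING`/`APPROVED`/`CANCELLED` states, and the `PLANNED`/`DONE` task labels are all assumptions. Only the STALE marker and the changelog come from the steps above.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    description: str
    affected_tasks: list[str]  # stands in for the Blast Radius Report
    status: str = "PENDING"    # PENDING -> APPROVED or CANCELLED


@dataclass
class Pipeline:
    # task id -> status; the "PLANNED"/"DONE" labels are hypothetical.
    task_status: dict[str, str]
    change_log: list[str] = field(default_factory=list)

    def approve(self, change: ChangeRequest) -> None:
        """Model !change-approve: mark affected tasks STALE and log the change."""
        if change.status != "PENDING":
            raise ValueError(f"change is already {change.status}")
        change.status = "APPROVED"
        for task_id in change.affected_tasks:
            self.task_status[task_id] = "STALE"
        self.change_log.append(change.description)

    def cancel(self, change: ChangeRequest) -> None:
        """Model !change-cancel: nothing in the build is touched."""
        if change.status != "PENDING":
            raise ValueError(f"change is already {change.status}")
        change.status = "CANCELLED"

    def stale_tasks(self) -> list[str]:
        """Tasks the Builder would need to re-plan before executing."""
        return sorted(t for t, s in self.task_status.items() if s == "STALE")
```

Because task statuses change only inside `approve`, a pending or cancelled request cannot cause drift between the spec and the build.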
Artifact Directory Structure
All pipeline artifacts are written relative to the active project directory, never the agents directory itself.
Pattern Library
The Pattern Library is a compounding knowledge base. Every time an agent solves a non-obvious problem, it captures the solution as a PATTERN-NNN entry. Future projects query the library before writing new code, preventing the same problem from being solved twice.
| Field | Purpose |
|---|---|
| domain | Technology area (e.g., auth, database, payments) |
| stack | Specific libraries involved |
| context | When this pattern applies |
| pattern | The implementation: code, config, or approach |
| pitfall_avoided | What goes wrong without this pattern |
| reuse_signal | Keywords that should trigger lookup |
| evidence | Source project or commit reference |
Use !pattern-add to capture the current implementation as a pattern. The Planner automatically queries PATTERN_INDEX.md at build start.
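To make the entry fields concrete, here is a minimal sketch of a pattern record and a keyword lookup driven by reuse_signal. The `PatternEntry` class, the `pattern_id` field, and `find_patterns` are hypothetical, and plain substring matching is an assumption. The actual format of PATTERN_INDEX.md and the Planner's query logic are not specified here.

```python
from dataclasses import dataclass


@dataclass
class PatternEntry:
    pattern_id: str  # e.g. "PATTERN-001"; the id field name is an assumption
    domain: str
    stack: list[str]
    context: str
    pattern: str
    pitfall_avoided: str
    reuse_signal: list[str]
    evidence: str


def find_patterns(library: list[PatternEntry], task_text: str) -> list[PatternEntry]:
    """Return entries whose reuse_signal keywords appear in the task text."""
    text = task_text.lower()
    return [
        entry
        for entry in library
        if any(keyword.lower() in text for keyword in entry.reuse_signal)
    ]
```

Substring matching is deliberately naive. A real lookup would likely need tokenization or ranking to avoid false positives on short keywords.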
Machine-Readable Contract Layer
The Planner emits two types of contracts that serve as ground truth for the Builder and Verifier:
- Zod schemas at artifacts/build/contracts/[module].contracts.ts: runtime-validated type shapes for every major data model and API response
- OpenAPI stubs at artifacts/build/contracts/openapi.yaml: endpoint signatures, request/response shapes, and authentication requirements
The Builder validates its output against these contracts after every task. The Verifier re-runs contract conformance independently in Phase 6C.
HANDOFF.json: Versioned State
Agents communicate via a single HANDOFF.json file that is progressively enriched as the pipeline advances. Each agent reads the current version and writes the next.
| Version | Written by | Key additions |
|---|---|---|
| v1 | Discovery | spec_hash, platforms, complexity_tier, tech_stack |
| v2 | Planning | task_count, contracts_path, schema_hash, stack_versions, component_library |
| v3 | Builder | tasks_completed, tasks_stale, patterns_used, perf_budget_path |
| v4 | Critic | scorecard_mean, scorecard_verdict, critical_flaws, next_action |
| v5 | Security | security_tier, domains_reviewed, critical_findings, blockers |
| v6 | Verifier | ship_verdict, perf_verdict, contract_verdict, visual_baseline |
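The read-current, write-next rule can be sketched as a pure function over the handoff dictionary. This is an illustration under stated assumptions: the integer `version` key, the `advance_handoff` helper, and the refusal to overwrite earlier fields are all hypothetical. Only the writer order comes from the table above.

```python
# Agent order taken from the "Written by" column above.
WRITERS = ["Discovery", "Planning", "Builder", "Critic", "Security", "Verifier"]


def advance_handoff(handoff: dict, agent: str, additions: dict) -> dict:
    """Return the next HANDOFF version with the agent's fields merged in.

    Assumption: the file carries an integer "version" key; an empty dict
    stands for "no handoff written yet" (version 0).
    """
    current = handoff.get("version", 0)
    if current >= len(WRITERS):
        raise ValueError("handoff is already at the final version")
    expected = WRITERS[current]
    if agent != expected:
        raise ValueError(f"v{current + 1} is written by {expected}, not {agent}")
    # Assumption: enrichment is additive, so earlier fields are never replaced.
    clash = set(additions) & set(handoff)
    if clash:
        raise ValueError(f"refusing to overwrite existing fields: {sorted(clash)}")
    return {**handoff, **additions, "version": current + 1}
```

Returning a new dictionary instead of mutating the input keeps each version available for comparison, which is convenient when debugging a stage that wrote unexpected fields.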
Parallel Task Execution
The Planner annotates every task with four parallelism fields that the Builder uses to schedule concurrent work:
| Field | Values | Purpose |
|---|---|---|
| task_type | api, auth, data, infra, ui, config, feature, cicd, obs | Determines which scripts and Pattern Library queries apply |
| depends_on | [TASK-NNN] or "none" | Explicit dependency edges that drive batch ordering |
| can_parallel | true or false | Whether this task may run concurrently with sibling tasks in the same batch |
| output_files | [list of files] | Used for write-conflict detection between parallel tasks |
The TASK-GRAPH ends with a Parallelism Map that groups tasks into dependency-ordered batches. Tasks with no shared output_files and can_parallel: true are placed in the same batch.
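One way to derive such a map from the annotated fields is a layered topological sort. The sketch below is not the Planner's actual algorithm: `build_batches` and the dictionary layout are hypothetical, `depends_on: "none"` is assumed to be parsed into an empty list, and a task with `can_parallel: false` is assumed to run in a batch of its own.

```python
def build_batches(tasks: dict[str, dict]) -> list[list[str]]:
    """Group tasks into dependency-ordered batches.

    `tasks` maps a task id to a dict with "depends_on" (list of task ids),
    "can_parallel" (bool), and "output_files" (list of paths).
    """
    done: set[str] = set()
    remaining = dict(tasks)
    batches: list[list[str]] = []
    while remaining:
        # A task is ready once every task it depends on has been scheduled.
        ready = sorted(
            tid for tid, task in remaining.items()
            if all(dep in done for dep in task["depends_on"])
        )
        if not ready:
            raise ValueError("dependency cycle or unknown task in depends_on")
        batch: list[str] = []
        claimed: set[str] = set()  # output files already owned by this batch
        for tid in ready:
            task = remaining[tid]
            files = set(task["output_files"])
            if not task["can_parallel"]:
                # Assumption: a non-parallel task only runs alone.
                if not batch:
                    batch = [tid]
                    break
                continue
            if files & claimed:
                continue  # shared output file: defer to a later batch
            batch.append(tid)
            claimed |= files
        for tid in batch:
            done.add(tid)
            del remaining[tid]
        batches.append(batch)
    return batches
```

Sorting the ready set makes the result deterministic, so the same task graph always yields the same batches.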
Method A: Go Goroutines (parallel_runner.go)
```shell
go run scripts/parallel_runner.go \
  -tasks TASK-003,TASK-004,TASK-005 \
  -task-graph artifacts/build/TASK-GRAPH.md \
  -output artifacts/telemetry/batch_results.json
```
Before launching goroutines, parallel_runner.go compares output_files across the batch. Tasks that write to the same file are automatically split into sequential sub-batches; no manual intervention is required.
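The pre-launch conflict check can be illustrated with a greedy first-fit partition. This Python sketch is a stand-in for the idea, not a port of parallel_runner.go: `split_on_write_conflicts` is a hypothetical name, and first-fit placement is an assumption about how sub-batches are formed.

```python
def split_on_write_conflicts(batch: dict[str, list[str]]) -> list[list[str]]:
    """Split one batch into sequential sub-batches with disjoint output files.

    `batch` maps task id -> output_files. Greedy first-fit: each task joins
    the earliest sub-batch in which none of its files is already claimed.
    """
    sub_batches: list[list[str]] = []
    claimed: list[set[str]] = []  # files written by each sub-batch
    for task_id, files in batch.items():
        file_set = set(files)
        for index, taken in enumerate(claimed):
            if not (file_set & taken):
                sub_batches[index].append(task_id)
                taken |= file_set  # in-place update of claimed[index]
                break
        else:
            # Every existing sub-batch conflicts, so open a new one.
            sub_batches.append([task_id])
            claimed.append(set(file_set))
    return sub_batches
```

Tasks inside one sub-batch can run concurrently. The sub-batches themselves run one after another, so two tasks never write the same file at the same time.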
Method B: Parallel Sub-Agents
Multiple Agent(...) calls in a single Claude Code response execute concurrently. The Builder uses this for architecture-class tasks that benefit from independent context windows. Results are merged back into the main task graph on completion.
Pipeline Scripts Integration
Eighteen Python and Go scripts are wired directly into agent workflows. Each script is called at a specific point in the pipeline; agents do not call them ad hoc.
| Script | Called by | Pipeline point |
|---|---|---|
| bootstrap.py | Agent 3 | !build activation |
| spec_validate.py | Agent 2 | Before !build unlocks |
| staged_write.py | Agent 3 | Before every file write |
| secret_scan.py | Agent 3 | Pre-write (blocks on detection) |
| write_log.py | Agent 3 | After each approved write |
| sast_scan.py | Agent 3 | After each write is applied |
| check_coverage.py | Agent 3, Agent 6 | After TC tests pass; Phase 3A |
| parallel_runner.go | Agent 3 | Multi-task batches |
| semantic_search.py | Agent 3 | Before every arch-class task |
| migration_safety.py | Agent 2, Agent 3 | Schema changes |
| pipeline_replay.py | Agent 0 (!replay) | On demand |
| session_fork.py | Agent 0 (!fork) | On demand |
| pipeline_improve.py | Agent 3, Agent 0 | Post-build, post-run |
| container_verify.py | Agent 6 | Phase 12B (before final SHIP) |
| contract_test.py | Agent 6 | Phase 6D (live endpoint validation) |
| commit_by_task.py | Agent 6 | Post-SHIP |
| register_project.py | Agents 3–6, Agent 7 | Build completion and each stage advancement |
| token_tracker.py | All agents | After every turn |
See Pipeline Scripts for the full reference including parameters, exit codes, and integration notes.
HANDOFF Version Evolution
Each agent enriches HANDOFF.json with fields its downstream consumers require. The full v1–v6 field progression:
| Version | Written by | Key additions |
|---|---|---|
| v1 | Discovery | project, platform, user_stories, capability_intelligence, design_contract |
| v2 | Planning | stack_versions, task_graph_path, parallelism_map, bootstrap, component_library |
| v3 | Builder | build_status, task_checkpoints, coverage_pct, secret_blocks, semantic_index_path |
| v4 | Critic | critique_verdict, scorecard_mean, market_context |
| v5 | Security | security_verdict, finding_counts, dependency_intelligence_summary |
| v6 | Verifier | verification_verdict, coverage_verdict, runtime_contract_testing, container_verification, review_rubric_score |
See HANDOFF.json Schema for the complete field-by-field specification for all six versions.
Performance Budget
Every project template ships with .agent/PERF_BUDGET.json defining thresholds:
```json
{
  "bundle": { "total_kb": 500, "initial_chunk_kb": 200 },
  "api": { "p50_ms": 200, "p95_ms": 800, "timeout_ms": 3000 },
  "build": { "max_warnings": 0 }
}
```
The Verifier runs scripts/perf.mjs to measure actual values and compares them against the budget. A budget violation is a NO-SHIP blocker. Use the --update flag to rebaseline after deliberate changes.
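The comparison step can be sketched as follows. This is a Python illustration of the idea, not the contents of scripts/perf.mjs: `budget_violations`, `blocks_ship`, and the shape of the `measured` dictionary are hypothetical, and the sketch assumes every threshold is an upper bound and that unmeasured metrics are skipped.

```python
import json

# Thresholds copied from the PERF_BUDGET.json example above.
BUDGET = json.loads("""
{
  "bundle": { "total_kb": 500, "initial_chunk_kb": 200 },
  "api": { "p50_ms": 200, "p95_ms": 800, "timeout_ms": 3000 },
  "build": { "max_warnings": 0 }
}
""")


def budget_violations(budget: dict, measured: dict) -> list[str]:
    """List every measured value that exceeds its budget threshold.

    Assumption: every threshold is an upper bound, and metrics missing from
    `measured` are skipped rather than treated as failures.
    """
    violations = []
    for section, limits in budget.items():
        for metric, limit in limits.items():
            actual = measured.get(section, {}).get(metric)
            if actual is not None and actual > limit:
                violations.append(f"{section}.{metric}: {actual} > {limit}")
    return violations


def blocks_ship(budget: dict, measured: dict) -> bool:
    """A single violation is enough to block shipping."""
    return bool(budget_violations(budget, measured))
```

A value exactly equal to its threshold passes in this sketch. Whether the real check treats equality as a pass is not stated above.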