Pipeline Scripts Reference
18 Python and Go infrastructure scripts that power the CodeSleuth AI pipeline. These scripts handle environment bootstrapping, file write safety, parallel execution, intelligence, database safety, pipeline operations, and post-build verification.
All script paths use /home/user/codesleuth/scripts/, which is the agents directory, not the project directory. Always use absolute paths when calling scripts from within a project. Examples in this reference use relative python scripts/ notation for readability; resolve to absolute paths in practice.
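As a minimal sketch of resolving the relative notation to absolute paths, the helper below builds an argument list for a script by name. The function names `script_path` and `build_command` are hypothetical, and the choice of `go run` for `.go` files simply mirrors the usage lines in this reference.

```python
from pathlib import PurePosixPath

# Agents directory as stated in this reference; adjust for your install.
SCRIPTS_DIR = PurePosixPath("/home/user/codesleuth/scripts")


def script_path(name: str) -> str:
    """Return the absolute path of a pipeline script such as 'bootstrap.py'."""
    return str(SCRIPTS_DIR / name)


def build_command(name: str, *args: str) -> list:
    """Build an argv list: .go scripts run via 'go run', others via 'python'."""
    runner = ["go", "run"] if name.endswith(".go") else ["python"]
    return [*runner, script_path(name), *args]
```

The resulting list can be passed to `subprocess.run` from inside any project directory without depending on the current working directory.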
Environment & Bootstrap
bootstrap.py
Validates toolchain, installs dependencies, and generates .env.example. Run at Agent 3 activation before any build tasks proceed.
| Detail | Value |
|---|---|
| Usage | python scripts/bootstrap.py --project-dir . --tdd artifacts/build/TDD.md |
| Exit 0 | Environment validated, deps installed, .env.example generated |
| Exit 1 | Blocked: missing toolchain component; error message identifies what is missing |
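The toolchain check can be pictured roughly as follows. This is an illustrative sketch only: the real script's checks are not documented here, and the function names and any tool list passed in are assumptions.

```python
import shutil


def missing_tools(required):
    """Return the tools from `required` that are not found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]


def bootstrap_exit_code(required):
    """Exit code in the spirit of the table above: 0 = validated, 1 = blocked."""
    missing = missing_tools(required)
    for tool in missing:
        # Name the missing component so the caller knows what to install.
        print(f"BLOCKED: missing toolchain component: {tool}")
    return 1 if missing else 0
```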
spec_validate.py
Validates TASK-GRAPH.md consistency before the !build gate opens. Catches structural issues before the Builder begins execution.
| Check | Description |
|---|---|
| task_type | Every task has a valid task_type field (api, auth, data, infra, ui, test, docs) |
| output_files | Every task declares at least one output_files entry |
| TC status | Test contracts reference valid task IDs |
| Circular deps | Dependency graph has no cycles |
| Parallelism map | Every task in the parallelism map exists in the task graph |
python scripts/spec_validate.py --task-graph artifacts/build/TASK-GRAPH.md
Exit 0 = pass. Exit 1 = fail with specific error messages per failing check.
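To illustrate the circular-dependency check, here is a minimal depth-first search over a task dependency map. It assumes the task graph has already been parsed into a dict of task ID to dependency IDs; how spec_validate.py actually parses TASK-GRAPH.md is not covered, and `find_cycle` is a hypothetical name.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of task IDs, or None if acyclic."""
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {task: UNSEEN for task in deps}
    stack: List[str] = []

    def visit(task: str) -> Optional[List[str]]:
        state[task] = ACTIVE
        stack.append(task)
        for dep in deps.get(task, []):
            dep_state = state.get(dep, UNSEEN)
            if dep_state == ACTIVE:
                # dep is already on the current path: we closed a loop.
                return stack[stack.index(dep):] + [dep]
            if dep_state == UNSEEN:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[task] = DONE
        return None

    for task in list(deps):
        if state[task] == UNSEEN:
            cycle = visit(task)
            if cycle:
                return cycle
    return None
```

A validator built on this would report the returned path in its error message and exit 1; the recursive form is adequate for task graphs of modest size.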
File Write Pipeline
Every file write by the Builder passes through a 5-step pipeline: secret scan → staged write → hot compilation → SAST scan → write log. Each step is a separate script.
staged_write.py
Stage a file write with a terminal diff and an Apply / Edit / Reject prompt. The human sees exactly what will change before any file is touched.
| Detail | Value |
|---|---|
| Usage | python scripts/staged_write.py stage --task TASK-NNN --file path --content /tmp/proposed.txt |
| Exit 0 | Write applied: user chose Apply or Edit-then-Apply |
| Exit 1 | Write rejected: repair context printed to stdout; Builder re-enters implementation step |
| Exit 2 | Error: file lock, permissions issue, or malformed arguments |
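A terminal diff of this kind can be produced with the standard library's `difflib`. The sketch below is not the script's implementation; `render_diff` and `exit_code_for` are hypothetical names, and treating unrecognized input as exit 2 is an assumption.

```python
import difflib


def render_diff(current: str, proposed: str, path: str) -> str:
    """Return a unified diff between the file on disk and the proposed content."""
    lines = difflib.unified_diff(
        current.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(lines)


def exit_code_for(choice: str) -> int:
    """Map the user's answer to the exit codes in the table above."""
    choice = choice.strip().lower()
    if choice in ("apply", "edit"):
        return 0  # write applied (directly or after editing)
    if choice == "reject":
        return 1  # write rejected; the Builder retries
    return 2      # assumption: unrecognized input is treated as an error
```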
secret_scan.py
Pre-write secret scanner. Runs before every file write and hard-blocks the write if any secret pattern is found. Repair context is returned to the Builder.
| Detail | Value |
|---|---|
| Usage | python scripts/secret_scan.py <target> [--log path] |
| Patterns | 14 regex patterns including OpenAI keys, Anthropic keys, AWS credentials, GitHub tokens, Stripe keys, Twilio tokens, and generic high-entropy strings |
| Exit 0 | Clean: no secrets detected |
| Exit 1 | Found: secret location and pattern name reported; write is blocked |
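The regex-based approach can be sketched as below. The two patterns are widely known token shapes used purely for illustration; they are not the script's actual 14 patterns, and `scan_text` is a hypothetical name.

```python
import re

# Illustrative patterns only; the real pattern set is not reproduced here.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def scan_text(text: str):
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


def exit_code(findings) -> int:
    """0 = clean, 1 = secret found (the write would be blocked)."""
    return 1 if findings else 0
```

Reporting the line number and pattern name, rather than the matched value, keeps the secret itself out of logs and repair context.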
write_log.py
Records every approved write to WRITE_LOG.jsonl. This log is consumed by commit_by_task.py to group commits by task.
| Sub-command | Description |
|---|---|
| record | Append a write entry: task ID, file path, timestamp, hash |
| list | Show all writes recorded in this session |
| task | Show all writes for a specific task ID |
python scripts/write_log.py record --task TASK-NNN --file path/to/file.py
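For orientation, a JSONL entry carrying the four fields named above might be produced as follows. The key names (`task`, `file`, `timestamp`, `hash`), the use of SHA-256, and the ISO 8601 UTC timestamp are assumptions for this sketch; the actual schema of WRITE_LOG.jsonl is not specified here.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_entry(task_id: str, file_path: str, content: bytes) -> dict:
    """Build one log entry; field names and hash algorithm are assumed."""
    return {
        "task": task_id,
        "file": file_path,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "hash": hashlib.sha256(content).hexdigest(),
    }


def append_entry(log_path: str, entry: dict) -> None:
    """Append the entry as a single JSON line (the JSONL convention)."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
```

Because each entry occupies exactly one line, a consumer can stream the log and group entries without loading the whole file.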
sast_scan.py
Post-write SAST scan. Runs non-blocking after each approved write. Uses gosec, bandit, or semgrep when available, with an inline regex fallback. Findings accumulate in SAST_FINDINGS.jsonl for the Security agent to review.
| Detail | Value |
|---|---|
| Usage | python scripts/sast_scan.py <target> --task TASK-NNN |
| Exit | Always exits 0 (non-blocking). Findings written to SAST_FINDINGS.jsonl. |
| Backends | gosec (Go), bandit (Python), semgrep (multi-language), inline regex fallback |
check_coverage.py
Enforces a minimum coverage threshold after test contracts pass. A result below the threshold blocks DONE marking for the task.
| Detail | Value |
|---|---|
| Usage | python scripts/check_coverage.py --threshold 70 --task TASK-NNN |
| Supported stacks | go (go test -cover), node (nyc/c8), python (pytest-cov), rust (cargo-tarpaulin) |
| Exit 0 | Coverage at or above threshold |
| Exit 1 | Coverage below threshold; current % and required % reported |
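As a rough illustration for the Go stack, the percentage can be read from the `coverage: N% of statements` lines that `go test -cover` prints per package. How the real script aggregates multiple packages is not documented; taking the lowest package figure is an assumption of this sketch, as are the function names.

```python
import re

COVERAGE_RE = re.compile(r"coverage: (\d+(?:\.\d+)?)% of statements")


def parse_go_coverage(output: str):
    """Return the per-package coverage percentages found in `go test -cover` output."""
    return [float(value) for value in COVERAGE_RE.findall(output)]


def coverage_exit_code(output: str, threshold: float) -> int:
    """0 if coverage meets the threshold, 1 otherwise (or if none was reported)."""
    values = parse_go_coverage(output)
    if not values:
        return 1
    current = min(values)  # assumption: the weakest package decides
    if current < threshold:
        print(f"coverage {current:.1f}% is below required {threshold:.1f}%")
        return 1
    return 0
```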
Parallel Execution
parallel_runner.go
Go goroutine batch executor. Runs multiple tasks simultaneously using goroutines and automatically detects write conflicts before execution begins.
| Detail | Value |
|---|---|
| Usage | go run scripts/parallel_runner.go -tasks TASK-001,TASK-002 -cmd "..." -task-graph artifacts/build/TASK-GRAPH.md |
| Conflict detection | Reads FILE_OWNERSHIP_MAP.md; refuses to run tasks that write to the same file in the same batch |
| Exit 0 | All tasks in the batch passed |
| Exit 1 | One or more tasks failed; per-task output captured and reported |
Method A (Go goroutines): parallel_runner.go; best for independent build/test commands. Method B (Agent tool calls): Agent 3 issues multiple parallel tool calls in a single response; best for independent file writes across non-overlapping task groups.
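The conflict rule (no two tasks in one batch may write the same file) is sketched below in Python for consistency with the other examples, although the runner itself is written in Go. It assumes the ownership map has already been parsed into a dict of task ID to output files; `find_write_conflicts` is a hypothetical name.

```python
from collections import defaultdict


def find_write_conflicts(batch, ownership):
    """Return {file: [task, ...]} for files written by more than one task in the batch."""
    writers = defaultdict(list)
    for task in batch:
        for path in ownership.get(task, []):
            writers[path].append(task)
    return {path: tasks for path, tasks in writers.items() if len(tasks) > 1}
```

An empty result means the batch is safe to run concurrently; any entry would be grounds to refuse the batch before a single command starts.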
Intelligence
semantic_search.py
Embedding-based RAG search over the codebase. Agent 3 queries this before each architecture-class task to find relevant prior code and avoid duplication.
| Sub-command | Description |
|---|---|
| index | Build semantic index from source files; writes to SEMANTIC_INDEX.json |
| search | Query the index; returns top-N relevant file snippets |
| update | Incrementally update the index after new files are written |
python scripts/semantic_search.py search --query "validate user input" --index artifacts/build/SEMANTIC_INDEX.json
Uses Ollama nomic-embed-text when available; falls back to TF-IDF if Ollama is not running.
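To make the TF-IDF fallback concrete, here is a small, dependency-free sketch that ranks documents by cosine similarity of TF-IDF vectors. It is a generic version of the technique, not the script's code; the tokenizer, the smoothed IDF formula, and all function names are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text: str):
    """Lowercase and split into alphanumeric tokens (underscores act as separators)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict):
    """Return (idf, vectors) for a mapping of document name to text."""
    tokens = {name: tokenize(body) for name, body in docs.items()}
    doc_freq = Counter()
    for toks in tokens.values():
        doc_freq.update(set(toks))
    n = len(docs)
    # Smoothed IDF so every known term gets a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        vectors[name] = {t: tf[t] * idf[t] for t in tf}
    return idf, vectors


def cosine(a: dict, b: dict) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, idf: dict, vectors: dict, top_n: int = 3):
    """Return up to top_n document names with a non-zero similarity to the query."""
    tf = Counter(tokenize(query))
    qvec = {t: tf[t] * idf[t] for t in tf if t in idf}
    scored = sorted(((cosine(qvec, v), name) for name, v in vectors.items()), reverse=True)
    return [name for score, name in scored[:top_n] if score > 0]
```

Unlike an embedding model, this only matches shared vocabulary, so a query such as "sanitize form data" would not find code that talks about "validate user input".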
Database
migration_safety.py
Schema delta detection, bidirectional SQL generation, and data-loss risk assessment. Migrations rated HIGH or DATA-LOSS block the pipeline until !migration-approve is issued.
| Sub-command | Description |
|---|---|
| generate | Diff current schema against previous; generate UP and DOWN SQL |
| check | Classify risk level (LOW / MEDIUM / HIGH / DATA-LOSS) and print report |
| dry-run | Apply migration to a shadow database and verify it succeeds without data loss |
When migration_safety.py check returns a HIGH or DATA-LOSS risk rating, the pipeline halts and prints a risk report. The human must issue !migration-approve to proceed or !migration-cancel to discard the schema change.
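The gating rule described above reduces to a small decision function. This sketch only models the halt/approve/cancel logic; the function name and the returned action strings are hypothetical, and the risk classification itself is out of scope.

```python
RISK_LEVELS = ("LOW", "MEDIUM", "HIGH", "DATA-LOSS")
BLOCKING_LEVELS = {"HIGH", "DATA-LOSS"}


def pipeline_action(risk: str, command: str = None) -> str:
    """Return 'proceed', 'halt', or 'discard' for a risk rating and optional human command."""
    if risk not in RISK_LEVELS:
        raise ValueError(f"unknown risk level: {risk}")
    if risk not in BLOCKING_LEVELS:
        return "proceed"
    if command == "!migration-approve":
        return "proceed"
    if command == "!migration-cancel":
        return "discard"
    return "halt"  # blocked until the human issues one of the two commands
```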
Pipeline Operations
pipeline_replay.py
Restore the codebase to the git checkpoint created after a specific task, then reset all later tasks to PENDING. Requires commit_by_task.py to have run, so each task has a corresponding git commit.
python scripts/pipeline_replay.py --task TASK-NNN --project-dir .
Invoked by the !replay TASK-NNN Orchestrator command. See Command Reference.
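The status-reset half of a replay can be sketched as follows. It assumes tasks have a known execution order and a status map; the git checkpoint restore is deliberately left out, and `reset_later_tasks` is a hypothetical name.

```python
def reset_later_tasks(order, statuses, target):
    """Return a copy of `statuses` with every task after `target` set to PENDING."""
    if target not in order:
        raise ValueError(f"unknown task: {target}")
    cutoff = order.index(target)
    updated = dict(statuses)  # leave the caller's map untouched
    for task in order[cutoff + 1:]:
        updated[task] = "PENDING"
    return updated
```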
session_fork.py
Fork the pipeline into two parallel branches for side-by-side comparison of alternative implementations or architectures.
| Sub-command | Description |
|---|---|
| create | Fork the project directory into two branch directories (fork-a, fork-b) |
| compare | Diff the two branches and summarize structural differences |
| merge | Copy the winning branch back to the project directory and remove both fork dirs |
| status | Show current fork state: which branch is ahead, task counts, test results |
Invoked by !fork and !merge Orchestrator commands.
register_project.py
Registers or updates a project entry in config/context.json → registered_projects. Called by Agents 3-6 at each stage completion to keep the registry current, and by Agent 7 after every maintenance cycle. Enables the always-resident pipeline: registered projects auto-resume on the next Start agent 0 without requiring re-activation.
| Detail | Value |
|---|---|
| Usage | python scripts/register_project.py --project-dir /path --project-name "MyApp" --stage built --last-agent 3 --handoff artifacts/build/HANDOFF.json |
| Stages | built → critiqued → secured → shipped → maintenance |
| Stage protection | Refuses to regress a stage: calling with an earlier stage than the current value is a no-op with a warning |
| Safety check | Blocks registration if --project-dir resolves inside the agents directory |
| Exit 0 | Project registered or updated successfully |
| Exit 1 | Safety violation (path inside agents dir) or config file missing/corrupt |
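Stage protection and the path safety check can be sketched as below. The stage order comes from the table above; the function names, the warning text, and the way the agents directory is passed in are assumptions.

```python
from pathlib import Path

STAGES = ["built", "critiqued", "secured", "shipped", "maintenance"]


def resolve_stage(current, requested):
    """Return (stage_to_store, warning). A request to move backwards is a no-op."""
    if requested not in STAGES:
        raise ValueError(f"unknown stage: {requested}")
    if current is not None and STAGES.index(requested) < STAGES.index(current):
        return current, f"refusing to regress stage from {current} to {requested}"
    return requested, None


def is_inside(project_dir: str, agents_dir: str) -> bool:
    """True if project_dir resolves to agents_dir or to a path beneath it."""
    project = Path(project_dir).resolve()
    agents = Path(agents_dir).resolve()
    return project == agents or agents in project.parents
```

Resolving both paths before comparing them means relative segments such as `..` and symlinks cannot be used to slip a project into the agents directory unnoticed.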
pipeline_improve.py
Post-run telemetry analyzer. After a pipeline completes, analyzes timing, failure rates, and retry patterns to generate PROP-NNN improvement proposals.
| Invocation | Description |
|---|---|
| python scripts/pipeline_improve.py | Default: analyze current session and emit proposals to PROP_LOG.md |
| python scripts/pipeline_improve.py --health | Show existing proposals ranked by impact score |
| python scripts/pipeline_improve.py --promote PROP-NNN | Apply a specific proposal to the agents directory |
Invoked by !pipeline-health and !pipeline-promote commands.
token_tracker.py
Session token usage tracking and !budget reporting. Reads per-model pricing from config/pricing.json.
| Sub-command | Description |
|---|---|
| record | Record token counts for a completed agent turn |
| budget | Print session summary: total tokens, context window %, estimated cost, per-agent breakdown |
| breakdown | Per-task token breakdown with cache hit/miss analysis |
Pricing data from config/pricing.json supports multiple Claude models with separate input/output/cache rates.
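A cost estimate from separate input, output, and cache rates amounts to a weighted sum. The sketch below assumes rates expressed in USD per million tokens and uses made-up category names; the real layout of config/pricing.json is not documented here, and the numbers in the usage note are placeholders, not actual prices.

```python
def estimate_cost(usage: dict, rates: dict) -> float:
    """Return the estimated cost in USD.

    usage: token counts per category, e.g. {"input": 1200, "output": 300}
    rates: USD per million tokens per category (hypothetical schema)
    """
    total = 0.0
    for category, tokens in usage.items():
        if category not in rates:
            raise KeyError(f"no rate configured for category: {category}")
        total += tokens * rates[category]
    return total / 1_000_000
```

For example, with placeholder rates of 3.0 (input), 15.0 (output), and 0.3 (cache read), a turn using 1,000,000 input, 500,000 output, and 2,000,000 cache-read tokens comes to about 11.10 USD.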
Post-Build
container_verify.py
Build the Docker image and run the full test suite inside the container. Auto-generates a Dockerfile.test if none exists. A test suite that passes locally but fails inside the container indicates a genuine environment mismatch.
| Detail | Value |
|---|---|
| Usage | python scripts/container_verify.py --project-dir . |
| Exit 0 | Image built and all tests pass inside container |
| Exit 1 | Tests fail in container; container logs and diff vs. local results printed |
Invoked by the Verifier agent in Phase 12B. See Verifier.
contract_test.py
Start the application, call every endpoint declared in INTERFACES.md with minimal requests, and validate status codes and response fields. Provides runtime API contract verification beyond static type checking.
python scripts/contract_test.py --interfaces artifacts/build/INTERFACES.md --base-url http://localhost:8080
Invoked by the Verifier agent in Phase 6D.
commit_by_task.py
Read WRITE_LOG.jsonl, group all writes by task ID, create one conventional git commit per task, and generate PR_BODY.md with a structured pull-request description.
python scripts/commit_by_task.py --project-dir .
| Output | Description |
|---|---|
| Git commits | One conventional commit per task (feat:, fix:, test:, etc.) in task order |
| PR_BODY.md | Structured PR description with task list, test summary, and change log |
Also required by pipeline_replay.py: each task needs a commit reference for checkpoint restoration.
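The grouping step can be illustrated with a few lines of Python. The sketch assumes each JSONL entry has `task` and `file` keys, which is a guess at the log schema, and it keeps tasks in the order of their first write; the real script's ordering rule and commit message format may differ.

```python
import json


def group_by_task(jsonl_text: str) -> dict:
    """Return {task_id: [files...]} from JSONL text, skipping blank lines."""
    groups = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        files = groups.setdefault(entry["task"], [])
        if entry["file"] not in files:  # a file rewritten twice is listed once
            files.append(entry["file"])
    return groups
```

Each resulting group would map to one commit, with the file list serving as the paths to stage for that task.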