Pipeline Scripts Reference

This reference covers the 18 Python and Go infrastructure scripts that power the CodeSleuth AI pipeline. They handle environment bootstrapping, file write safety, parallel execution, code intelligence (semantic search), database migration safety, pipeline operations, and post-build verification.

⚠️ Script path convention

All scripts live in /home/user/codesleuth/scripts/, which is the agents directory, not the project directory. Always use absolute paths when calling scripts from within a project. Examples in this reference use the relative python scripts/ notation for readability; resolve them to absolute paths in practice.
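One way to follow this convention from Python is to build every command from a single absolute base path. This is a minimal sketch; the helper name script_cmd is hypothetical and not part of CodeSleuth.

```python
import sys
from pathlib import PurePosixPath

# Agents directory named in the path convention above.
SCRIPTS_DIR = PurePosixPath("/home/user/codesleuth/scripts")


def script_cmd(name: str, *args: str) -> list[str]:
    """Return an argv list that calls a pipeline script by absolute path.

    Hypothetical helper for illustration only.
    """
    return [sys.executable, str(SCRIPTS_DIR / name), *args]
```

The returned list can be passed to subprocess.run, so the call works regardless of the project's current working directory.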

Environment & Bootstrap

bootstrap.py

Validates toolchain, installs dependencies, and generates .env.example. Run at Agent 3 activation before any build tasks proceed.

| Detail | Value |
| --- | --- |
| Usage | python scripts/bootstrap.py --project-dir . --tdd artifacts/build/TDD.md |
| Exit 0 | Environment validated, deps installed, .env.example generated |
| Exit 1 | Blocked: missing toolchain component; error message identifies what is missing |

spec_validate.py

Validates TASK-GRAPH.md consistency before the !build gate opens. Catches structural issues before the Builder begins execution.

| Check | Description |
| --- | --- |
| task_type | Every task has a valid task_type field (api, auth, data, infra, ui, test, docs) |
| output_files | Every task declares at least one output_files entry |
| TC status | Test contracts reference valid task IDs |
| Circular deps | Dependency graph has no cycles |
| Parallelism map | Every task in the parallelism map exists in the task graph |

python scripts/spec_validate.py --task-graph artifacts/build/TASK-GRAPH.md

Exit 0 = pass. Exit 1 = fail with specific error messages per failing check.
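The circular-dependency check can be pictured as a depth-first search that looks for a back edge. The sketch below illustrates that technique only; it is not taken from spec_validate.py, and the input shape (a dict mapping each task ID to the task IDs it depends on) is an assumption.

```python
def has_cycle(deps: dict[str, list[str]]) -> bool:
    """Return True if the dependency graph contains a cycle.

    `deps` maps a task ID to the task IDs it depends on (assumed shape).
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {task: WHITE for task in deps}

    def visit(task: str) -> bool:
        state[task] = GRAY
        for dep in deps.get(task, []):
            dep_state = state.get(dep, WHITE)
            if dep_state == GRAY:  # back edge: dep is already on the current path
                return True
            if dep_state == WHITE and visit(dep):
                return True
        state[task] = BLACK
        return False

    return any(state[task] == WHITE and visit(task) for task in deps)
```

A validator built this way would exit 1 when has_cycle returns True and report the offending tasks.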

File Write Pipeline

Every file write by the Builder passes through a 5-step pipeline: secret scan → staged write → hot compilation → SAST scan → write log. Each step is a separate script.

staged_write.py

Stage a file write with a terminal diff and an Apply / Edit / Reject prompt. The human sees exactly what will change before any file is touched.

| Detail | Value |
| --- | --- |
| Usage | python scripts/staged_write.py stage --task TASK-NNN --file path --content /tmp/proposed.txt |
| Exit 0 | Write applied: user chose Apply or Edit-then-Apply |
| Exit 1 | Write rejected: repair context printed to stdout; Builder re-enters implementation step |
| Exit 2 | Error: file lock, permissions issue, or malformed arguments |
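The terminal diff shown before a staged write could be produced with the standard library's difflib. This is a sketch of the idea, not the script's actual rendering code; the function name render_diff is hypothetical.

```python
import difflib


def render_diff(path: str, current: str, proposed: str) -> str:
    """Return a unified diff between a file's current and proposed content.

    Hypothetical helper; staged_write.py may render its diff differently.
    """
    lines = difflib.unified_diff(
        current.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(lines)
```

Printing the result before prompting lets the reviewer see removed lines (prefixed with -) and added lines (prefixed with +) before anything is written.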

secret_scan.py

Pre-write secret scanner. Runs before every file write and hard-blocks the write if any secret pattern is found. Repair context is returned to the Builder.

| Detail | Value |
| --- | --- |
| Usage | python scripts/secret_scan.py <target> [--log path] |
| Patterns | 14 regex patterns including OpenAI keys, Anthropic keys, AWS credentials, GitHub tokens, Stripe keys, Twilio tokens, and generic high-entropy strings |
| Exit 0 | Clean: no secrets detected |
| Exit 1 | Found: secret location and pattern name reported; write is blocked |
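A regex-based pre-write scan of this kind can be sketched in a few lines. The two patterns below are illustrative examples of well-known token shapes; they are assumptions and not the scanner's actual 14 patterns, which this reference does not list.

```python
import re

# Illustrative patterns only (assumed); the real pattern set is not shown here.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def find_secrets(text: str) -> list[tuple[str, int]]:
    """Return (pattern_name, line_number) for every match, 1-indexed."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((name, lineno))
    return hits
```

A caller mirroring the exit codes above would exit 1 when the list is non-empty and 0 otherwise.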

write_log.py

Records every approved write to WRITE_LOG.jsonl. This log is consumed by commit_by_task.py to group commits by task.

| Sub-command | Description |
| --- | --- |
| record | Append a write entry: task ID, file path, timestamp, hash |
| list | Show all writes recorded in this session |
| task | Show all writes for a specific task ID |

python scripts/write_log.py record --task TASK-NNN --file path/to/file.py
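Appending one JSON object per line is what makes a .jsonl log cheap to write and easy to replay. The sketch below shows that pattern with the four fields named for the record sub-command; the key names and the choice of SHA-256 are assumptions, since the actual WRITE_LOG.jsonl schema is not documented here.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def record_write(log_path: Path, task_id: str, file_path: Path) -> dict:
    """Append one entry to a JSONL write log and return it.

    Key names and hash algorithm are assumed, not the real schema.
    """
    entry = {
        "task": task_id,
        "file": str(file_path),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(file_path.read_bytes()).hexdigest(),
    }
    # Append mode keeps earlier entries intact; one JSON object per line.
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```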

sast_scan.py

Post-write SAST scan. Runs non-blocking after each approved write. Uses gosec, bandit, or semgrep when available, with an inline regex fallback. Findings accumulate in SAST_FINDINGS.jsonl for the Security agent to review.

| Detail | Value |
| --- | --- |
| Usage | python scripts/sast_scan.py <target> --task TASK-NNN |
| Exit | Always exits 0 (non-blocking). Findings written to SAST_FINDINGS.jsonl. |
| Backends | gosec (Go), bandit (Python), semgrep (multi-language), inline regex fallback |

check_coverage.py

Enforces a minimum coverage threshold after test contracts pass. A result below the threshold blocks DONE marking for the task.

| Detail | Value |
| --- | --- |
| Usage | python scripts/check_coverage.py --threshold 70 --task TASK-NNN |
| Supported stacks | go (go test -cover), node (nyc/c8), python (pytest-cov), rust (cargo-tarpaulin) |
| Exit 0 | Coverage at or above threshold |
| Exit 1 | Coverage below threshold: current % and required % reported |
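For the Go stack, a threshold gate can be sketched by parsing the "coverage: N% of statements" lines that go test -cover prints. Gating on the lowest per-package figure is an assumption made for this example; the real script may aggregate coverage differently.

```python
import re

# Matches the summary that `go test -cover` prints for each package.
COVERAGE_RE = re.compile(r"coverage:\s+(\d+(?:\.\d+)?)% of statements")


def coverage_gate(go_test_output: str, threshold: float) -> tuple[bool, float]:
    """Return (passed, lowest_percent) for `go test -cover` output.

    Uses the lowest per-package value (assumed policy).
    """
    percents = [float(m) for m in COVERAGE_RE.findall(go_test_output)]
    if not percents:
        raise ValueError("no coverage lines found")
    lowest = min(percents)
    # "At or above threshold" passes, matching the Exit 0 description.
    return lowest >= threshold, lowest
```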

Parallel Execution

parallel_runner.go

Go goroutine batch executor. Runs multiple tasks simultaneously using goroutines and automatically detects write conflicts before execution begins.

| Detail | Value |
| --- | --- |
| Usage | go run scripts/parallel_runner.go -tasks TASK-001,TASK-002 -cmd "..." -task-graph artifacts/build/TASK-GRAPH.md |
| Conflict detection | Reads FILE_OWNERSHIP_MAP.md; refuses to run tasks that write to the same file in the same batch |
| Exit 0 | All tasks in the batch passed |
| Exit 1 | One or more tasks failed; per-task output captured and reported |

ℹ️ Two parallel execution methods

Method A (Go goroutines): parallel_runner.go. Best for independent build/test commands.
Method B (Agent tool calls): Agent 3 issues multiple parallel tool calls in a single response. Best for independent file writes across non-overlapping task groups.
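The write-conflict rule for a batch (no two tasks may write the same file) can be illustrated independently of the Go runner. The sketch below is written in Python for consistency with the other examples; the ownership dict is a stand-in for whatever FILE_OWNERSHIP_MAP.md actually contains, and its shape is an assumption.

```python
from collections import defaultdict


def find_write_conflicts(
    batch: list[str], ownership: dict[str, list[str]]
) -> dict[str, list[str]]:
    """Return {file: [task, ...]} for files written by more than one task.

    `ownership` maps a task ID to the files it writes (assumed shape).
    """
    writers = defaultdict(list)
    for task in batch:
        for path in ownership.get(task, []):
            writers[path].append(task)
    # Only files claimed by two or more tasks in this batch are conflicts.
    return {path: tasks for path, tasks in writers.items() if len(tasks) > 1}
```

An empty result means the batch is safe to run in parallel; a non-empty one names the files and tasks that need to be split across batches.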

Intelligence

semantic_search.py

Embedding-based RAG search over the codebase. Agent 3 queries this before each architecture-class task to find relevant prior code and avoid duplication.

| Sub-command | Description |
| --- | --- |
| index | Build semantic index from source files; writes to SEMANTIC_INDEX.json |
| search | Query the index; returns top-N relevant file snippets |
| update | Incrementally update the index after new files are written |

python scripts/semantic_search.py search --query "validate user input" --index artifacts/build/SEMANTIC_INDEX.json

Uses Ollama nomic-embed-text when available; falls back to TF-IDF if Ollama is not running.
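To make the TF-IDF fallback concrete, here is a small pure-Python ranking by cosine similarity of TF-IDF vectors. It is a sketch of the general technique under assumed tokenization and smoothing choices, not the code in semantic_search.py.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Splits on non-alphanumerics, so snake_case identifiers become words.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_search(query: str, docs: dict[str, str], top_n: int = 3) -> list[tuple[str, float]]:
    """Rank documents against the query by cosine similarity of TF-IDF vectors."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokens.values() for term in set(toks))

    def vectorize(toks: list[str]) -> dict[str, float]:
        tf = Counter(toks)
        # Smoothed IDF (assumed variant) keeps every weight positive.
        return {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}

    def cosine(a: dict[str, float], b: dict[str, float]) -> float:
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q_vec = vectorize(tokenize(query))
    scored = [(name, cosine(q_vec, vectorize(toks))) for name, toks in tokens.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]
```

Unlike embeddings, this ranking only rewards shared terms, so a query phrased with synonyms of the code's identifiers would score poorly.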

Database

migration_safety.py

Schema delta detection, bidirectional SQL generation, and data-loss risk assessment. Migrations rated HIGH or DATA-LOSS block the pipeline until !migration-approve is issued.

| Sub-command | Description |
| --- | --- |
| generate | Diff current schema against previous; generate UP and DOWN SQL |
| check | Classify risk level (LOW / MEDIUM / HIGH / DATA-LOSS) and print report |
| dry-run | Apply migration to a shadow database and verify it succeeds without data loss |

⚠️ HIGH risk blocks pipeline

When migration_safety.py check returns a HIGH or DATA-LOSS risk rating, the pipeline halts and prints a risk report. The human must issue !migration-approve to proceed or !migration-cancel to discard the schema change.
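The gating rule itself is small enough to state as code. This sketch assumes the risk rating arrives as one of the four strings listed for the check sub-command; how the real script passes the rating to the pipeline is not documented here.

```python
# Levels named in this reference; HIGH and DATA-LOSS halt the pipeline.
BLOCKING_LEVELS = {"HIGH", "DATA-LOSS"}
KNOWN_LEVELS = {"LOW", "MEDIUM"} | BLOCKING_LEVELS


def requires_approval(risk_level: str) -> bool:
    """Return True if the rating should halt the pipeline pending approval."""
    level = risk_level.strip().upper()
    if level not in KNOWN_LEVELS:
        # Failing loudly on an unknown rating is an assumed design choice.
        raise ValueError(f"unknown risk level: {risk_level!r}")
    return level in BLOCKING_LEVELS
```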

Pipeline Operations

pipeline_replay.py

Restore the codebase to the git checkpoint created after a specific task, then reset all later tasks to PENDING. Requires commit_by_task.py to have run, so each task has a corresponding git commit.

python scripts/pipeline_replay.py --task TASK-NNN --project-dir .

Invoked by the !replay TASK-NNN Orchestrator command. See Command Reference.
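The status-reset half of a replay can be sketched as follows. The task order list and the status dict are assumed data shapes, and the git checkpoint restore is deliberately left out; this is not the script's implementation.

```python
def reset_later_tasks(
    order: list[str], status: dict[str, str], checkpoint: str
) -> dict[str, str]:
    """Return a copy of `status` with every task after `checkpoint` set to PENDING.

    `order` is the task execution order (assumed to be known to the caller).
    """
    idx = order.index(checkpoint)  # raises ValueError for an unknown task ID
    later = set(order[idx + 1:])
    return {task: ("PENDING" if task in later else state) for task, state in status.items()}
```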

session_fork.py

Fork the pipeline into two parallel branches for side-by-side comparison of alternative implementations or architectures.

| Sub-command | Description |
| --- | --- |
| create | Fork the project directory into two branch directories (fork-a, fork-b) |
| compare | Diff the two branches and summarize structural differences |
| merge | Copy the winning branch back to the project directory and remove both fork dirs |
| status | Show current fork state: which branch is ahead, task counts, test results |

Invoked by !fork and !merge Orchestrator commands.

register_project.py

Registers or updates a project entry in config/context.json → registered_projects. Called by Agents 3–6 at each stage completion to keep the registry current, and by Agent 7 after every maintenance cycle. Enables the always-resident pipeline — registered projects auto-resume on the next Start agent 0 without requiring re-activation.

| Detail | Value |
| --- | --- |
| Usage | python scripts/register_project.py --project-dir /path --project-name "MyApp" --stage built --last-agent 3 --handoff artifacts/build/HANDOFF.json |
| Stages | built → critiqued → secured → shipped → maintenance |
| Stage protection | Refuses to regress a stage: calling with an earlier stage than the current value is a no-op with a warning |
| Safety check | Blocks registration if --project-dir resolves inside the agents directory |
| Exit 0 | Project registered or updated successfully |
| Exit 1 | Safety violation (path inside agents dir) or config file missing/corrupt |
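Stage protection amounts to comparing positions in an ordered list. A minimal sketch, assuming the stage order given above; the function name resolve_stage is hypothetical.

```python
from typing import Optional

# Stage order as listed in this reference.
STAGES = ["built", "critiqued", "secured", "shipped", "maintenance"]


def resolve_stage(current: Optional[str], requested: str) -> str:
    """Return the stage to store, never earlier than the current one."""
    if requested not in STAGES:
        raise ValueError(f"unknown stage: {requested!r}")
    if current is not None and STAGES.index(requested) < STAGES.index(current):
        # Regression is a no-op with a warning, as described above.
        print(f"warning: ignoring regression from {current!r} to {requested!r}")
        return current
    return requested
```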

pipeline_improve.py

Post-run telemetry analyzer. After a pipeline completes, analyzes timing, failure rates, and retry patterns to generate PROP-NNN improvement proposals.

| Invocation | Description |
| --- | --- |
| python scripts/pipeline_improve.py | Default: analyze current session and emit proposals to PROP_LOG.md |
| python scripts/pipeline_improve.py --health | Show existing proposals ranked by impact score |
| python scripts/pipeline_improve.py --promote PROP-NNN | Apply a specific proposal to the agents directory |

Invoked by !pipeline-health and !pipeline-promote commands.

token_tracker.py

Session token usage tracking and !budget reporting. Reads per-model pricing from config/pricing.json.

| Sub-command | Description |
| --- | --- |
| record | Record token counts for a completed agent turn |
| budget | Print session summary: total tokens, context window %, estimated cost, per-agent breakdown |
| breakdown | Per-task token breakdown with cache hit/miss analysis |

config/pricing.json holds separate input, output, and cache rates for multiple Claude models.
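An estimated cost is the sum of each token count multiplied by its rate. The sketch below assumes rates are expressed in USD per million tokens and keyed by token kind; the actual layout of config/pricing.json is not shown in this reference, and any rates passed in are the caller's own.

```python
def estimate_cost(usage: dict[str, int], rates: dict[str, float]) -> float:
    """Return the estimated cost in USD.

    `usage` maps a token kind (e.g. "input", "output") to a token count;
    `rates` maps the same kinds to USD per million tokens (assumed unit).
    """
    return sum(count * rates[kind] / 1_000_000 for kind, count in usage.items())
```

A missing rate raises KeyError, which surfaces a pricing gap instead of silently reporting a lower cost.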

Post-Build

container_verify.py

Build the Docker image and run the full test suite inside the container. Auto-generates a Dockerfile.test if none exists. A test suite that passes locally but fails inside the container indicates a genuine environment mismatch.

| Detail | Value |
| --- | --- |
| Usage | python scripts/container_verify.py --project-dir . |
| Exit 0 | Image built and all tests pass inside container |
| Exit 1 | Tests fail in container: container logs and diff vs. local results printed |

Invoked by the Verifier agent in Phase 12B. See Verifier.

contract_test.py

Start the application, call every endpoint declared in INTERFACES.md with minimal requests, and validate status codes and response fields. Provides runtime API contract verification beyond static type checking.

python scripts/contract_test.py --interfaces artifacts/build/INTERFACES.md --base-url http://localhost:8080

Invoked by the Verifier agent in Phase 6D.

commit_by_task.py

Read WRITE_LOG.jsonl, group all writes by task ID, create one conventional git commit per task, and generate PR_BODY.md with a structured pull-request description.

python scripts/commit_by_task.py --project-dir .

| Output | Description |
| --- | --- |
| Git commits | One conventional commit per task (feat:, fix:, test:, etc.) in task order |
| PR_BODY.md | Structured PR description with task list, test summary, and change log |

Also required by pipeline_replay.py — each task needs a commit reference for checkpoint restoration.
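The grouping step can be sketched as a single pass over the log. The key names "task" and "file" are assumptions about the WRITE_LOG.jsonl schema, and the git commit step is omitted.

```python
import json
from collections import defaultdict


def group_writes_by_task(jsonl_text: str) -> dict[str, list[str]]:
    """Map each task ID to the files it wrote, in first-seen task order.

    Assumes each log line is a JSON object with "task" and "file" keys.
    """
    grouped = defaultdict(list)
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the log
        entry = json.loads(line)
        files = grouped[entry["task"]]
        if entry["file"] not in files:  # a file rewritten twice is listed once
            files.append(entry["file"])
    return dict(grouped)
```

Each key of the result would correspond to one commit, with its file list passed to git add before committing.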