Darwin component map

This page is the ground truth for the PRD-to-PR pipeline: which Darwin components are wired in today, which are useful but parked in a separate vertical, and which are outstanding. The prioritized "outstanding" list in §(c) is the build queue.

The audit pass that produced this map ran on 2026-04-30 across be-agent-service/agents/prompts/, be-agent-service/apps/server/, beta-appcaire/.claude/agents/, beta-appcaire/scripts/, beta-appcaire/.github/workflows/, and beta-appcaire/docs/dossiers/. It was refreshed that evening, after the runner end-to-end validation.

(a) Used by PRD-to-PR pipeline (engineering vertical)

These components are invoked by the pipeline today. Components marked (2026-04-30) moved here from §(c) after shipping.

Component	Location	Pipeline use
`POST /api/prd-to-pr` + 9-stage runner (2026-04-30)	`be-agent-service/apps/server/src/{routes/prd-to-pr.ts,pipeline/runner.ts}`	The state machine that walks every stage. Persists transitions to `prd_to_pr_runs` in `agent-service.db`. Honors `MAX_REVIEW_CYCLES=3`, `MAX_CONCURRENT_PRD_RUNS=10`, `MAX_RUNS_PER_DAY=30`. Re-entry edges 5→4 and 6→4.
Stage modules (2026-04-30)	`be-agent-service/apps/server/src/pipeline/stages/{intake,prd,spec,tests,implement,review,verify,reviewerLoop,merge}.ts`	One per stage. `intake.ts` does worktree creation + env replication + `yarn install` + `yarn generate` + per-server `db:generate`. `tests.ts` WIP-commits the architect's spec and the test-writer's tests so the editor's `git stash` can't destroy them. `implement.ts` runs the iteration loop with transactional stash. `verify.ts` runs the dossier bundler then re-enters `implement` on Playwright failure with a `verifyHint`.
Dossier bundler (2026-04-30)	`be-agent-service/apps/server/src/pipeline/dossier.ts`	Captures Playwright trace + final-state screenshot, filters console output, emits `summary.json` + a human-readable `README.md`, writes everything to `docs/dossiers/<feature>/` and stages a commit. Stage 6 fails if any required artifact is missing.
Named agent prompts (2026-04-30)	`be-agent-service/agents/prompts/engineering/{architect,test-writer,editor,verifier,reviewer-feedback}.md`	Spawned via `claude -p` in their respective stages. `architect.md` is route-locked (must copy `/`, `/home`, etc. verbatim from PRD). `editor.md` carries the iteration contract.
L1 dashboard pages (2026-04-30)	`be-agent-service/apps/dashboard/src/pages/PRDToPR{Page,DetailPage}.tsx`	Run list, detail view with live-ticking elapsed clock, per-stage timing table fed from `summary_json.stageHistory[]`, artifact viewer (PRD body / spec / tests / diff / dossier), Approve / Reject buttons. Stage log endpoint globs `<stage>.*.log` so every agent invocation surfaces.
Codex / CodeRabbit polling (2026-04-30)	`be-agent-service/apps/server/src/pipeline/stages/reviewerLoop.ts`	Runs `git push`, then polls `gh api repos/{org}/{repo}/pulls/<n>/comments`. Parses P1/P2/P3 severity. Re-enters `implement` on unresolved P1/P2; argues findings down via PR description text where applicable.
Auto-kill switch + Max Plan OAuth (2026-04-30)	`be-agent-service/apps/server/src/pipeline/env.ts`	`enforceAutoKillSwitch()` blocks new runs past concurrency / daily caps; HTTP 429 surfaced. `isMaxPlanPreferred()` reads `CLAUDE_USE_MAX_PLAN=1` from `~/.config/caire/env` and strips `ANTHROPIC_API_KEY` from spawned subprocess env so `claude -p` falls through to the macOS Keychain OAuth credentials.
Three review subagents — `resolver-reviewer`, `dashboard-reviewer`, `perf-reviewer`	`beta-appcaire/.claude/agents/*.md` (tracked via a `.gitignore` negation rule that re-includes `.claude/agents/*`)	Stage 5. Wired and invoked in parallel before stage 6 verify. Routing decided by the architect's spec metadata.
`engineering/orchestrator` agent prompt	`be-agent-service/agents/prompts/engineering/orchestrator.md`	Coordinates the compound nightly workflow. Not invoked by the PRD-to-PR runner — the runner.ts state machine plays that role.
`engineering/backend-specialist`, `frontend-specialist`, `db-architect-specialist`, `infrastructure-specialist`, `ux-designer-specialist`	`be-agent-service/agents/prompts/engineering/*.md`	Available to Cursor Composer / Claude Code in a worktree (the L0 path). Not invoked by the PRD-to-PR runner — that path uses the named `editor.md` agent.
`engineering/senior-code-reviewer`, `verification-specialist`	`be-agent-service/agents/prompts/engineering/*.md`	Same: L0 path.
`management/interface-agent`	`be-agent-service/agents/prompts/management/interface-agent.md`	Target-state stage 0 / stage 8 Telegram bridge. Already brokers Telegram traffic for ad-hoc requests; needs a routing rule to recognise PRD-style messages — see §(c).1 below.
`management/cpo-cto`	`be-agent-service/agents/prompts/management/cpo-cto.md`	Target-state cross-team router and quarterly model-routing ratifier (per model-and-vendor-agnosticism.md).
Darwin runtime	`be-agent-service/apps/server/src/index.ts:42` (port 3010)	HTTP host for `/api/prd-to-pr/*`. Also serves `/api/repos`, `/api/agents`, `/api/workspace`, `/api/jobs`, `/api/schedules`, `/api/metrics`.
SQLite state	`.compound-state/agent-service.db`	Persists `prd_to_pr_runs` rows + per-stage `stageHistory[]` in `summary_json`. Throughput log per throughput-and-business-signals.md.
Launchd job slots	`be-agent-service/scripts/manage-darwin-dashboard-launchd.sh`	Hosts the unified server in production. Same job slot also runs schedule research and compound nightly.
Compound nightly workflow	`be-agent-service/scripts/compound/{auto-compound,daily-compound-review,loop,analyze-report}.sh`	Separate vertical (priorities-driven nightly review). Pre-existing; not replaced by the PRD-to-PR runner.
Worktree scripts	`beta-appcaire/scripts/git/worktree-{add,remove}.sh`	Stage 0 worktree creation. Mandatory for PRD-driven work per the standing memory rule.
GitHub Merge Queue	`beta-appcaire/.github/workflows/pr-checks.yml` (`merge_group:` triggers, `dashboard-server tests (required for merge)` aggregator)	Stage 8 gate. Live; the pipeline targets it via `gh pr merge --auto --squash`.
`docs/dossiers/<feature>/`	`beta-appcaire/docs/dossiers/*`	Stage 6 storage. Now populated automatically by the dossier bundler. Pre-existing hand-crafted dossiers (e.g. `visit-chain-scope`) remain as historical reference.
`wiki/plans/<feature>-YYYY-MM-DD.md` PRD convention	`beta-appcaire/wiki/plans/*`	Stage 1 input. Convention is in active use; humans write PRDs here.
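
The runner row in the table above describes a state machine with two re-entry edges (5→4 and 6→4) and a cap of `MAX_REVIEW_CYCLES=3`. The transition rule can be sketched roughly as follows. This is an illustration only, not the contents of `runner.ts`: it assumes stages are numbered 0 to 8 in the order the stage modules are listed, that every re-entry into `implement` consumes one review cycle, and the names `StageOutcome`, `Transition`, and `nextStage` are hypothetical.

```typescript
// Hypothetical sketch of the runner's transition rule; not the real runner.ts.
// Assumption: stage number = index in the stage-module list.
const STAGES = [
  "intake", "prd", "spec", "tests", "implement",
  "review", "verify", "reviewerLoop", "merge",
] as const;

const MAX_REVIEW_CYCLES = 3; // cap named in the runner row
const IMPLEMENT = 4;
// Re-entry edges listed in the runner row: 5→4 and 6→4.
const REENTRY_SOURCES = new Set<number>([5, 6]);

type StageOutcome = "pass" | "fail";
type Transition =
  | { kind: "advance"; stage: number; cycles: number }
  | { kind: "done" }
  | { kind: "abort"; reason: string };

function nextStage(current: number, outcome: StageOutcome, cycles: number): Transition {
  if (outcome === "pass") {
    return current === STAGES.length - 1
      ? { kind: "done" }
      : { kind: "advance", stage: current + 1, cycles };
  }
  if (REENTRY_SOURCES.has(current)) {
    // Assumption: each re-entry into implement consumes one review cycle.
    if (cycles >= MAX_REVIEW_CYCLES) {
      return { kind: "abort", reason: `exceeded ${MAX_REVIEW_CYCLES} review cycles` };
    }
    return { kind: "advance", stage: IMPLEMENT, cycles: cycles + 1 };
  }
  // Any other failing stage ends the run in this sketch.
  return { kind: "abort", reason: `stage ${STAGES[current]} failed` };
}
```

Keeping the rule as a pure function of `(current, outcome, cycles)` would make each persisted transition in `prd_to_pr_runs` replayable in a unit test without spawning any agent.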

(b) Parked — separate verticals, not used by PRD-to-PR

These components are part of Darwin but solve different problems. They should not be merged into the PRD-to-PR pipeline; mixing verticals produces ambiguous orchestration. Listed here so the reader can rule them out without re-doing the audit.

Component	Location	Why it's not in the PRD-to-PR pipeline
Marketing team — `jarvis-orchestrator`, `shuri-product-analyst`, `fury-customer-researcher`, `vision-seo-analyst`, `loki-content-writer`, `quill-social-media`, `pepper-email-marketing`, `wanda-designer`, `friday-developer`, `wong-notion-agent`	`be-agent-service/agents/prompts/marketing/*.md`	Marketing/content vertical with its own pipeline (`loki-content-writer.sh` etc.) and its own `humanizer` skill. Different intake, different output.
Optimization team — `optimization-mathematician`, `timefold-specialist`	`be-agent-service/agents/prompts/optimization/*.md`	Schedule research vertical. Reads benchmark history, proposes constraint-weight changes. Consumes the throughput log produced by the PRD-to-PR pipeline (per throughput-and-business-signals.md) but is not part of it.
`management/ceo`, `management/hr-agent-lead`	`be-agent-service/agents/prompts/management/*.md`	Strategic / org concerns. Useful for budget allocation per vision commitment (d), but not invoked per-PR.
Hannes Dashboard	`be-agent-service/scripts/manage-hannes-dashboard-launchd.sh`	Separate launchd job for a separate stakeholder view. Distinct concern from the engineering Dashboard at `localhost:3010`.
Schedule research loop	`be-agent-service/apps/server/src/routes/schedule-runs/*`	Optimization vertical's runtime. Reuses Darwin's launchd slots and SQLite, but is not feature delivery.

(c) Outstanding — the prioritized build queue

Items 1–5 from the original queue (pipeline runner, dossier bundler, Codex polling, named agent prompts, L1 dashboard UI) shipped between commits 92bc142 and 0270e90 and have moved to §(a). What remains:

1. Telegram → `POST /api/prd-to-pr` routing — ~1 day

File: be-agent-service/agents/prompts/management/interface-agent.md plus a small router under be-agent-service/apps/server/src/routes/telegram/.
What it does: detect PRD-style Telegram messages (URL to a wiki/plans/*.md file or an attached PRD), POST to /api/prd-to-pr, post the dossier README.md + screenshot.png back to the originating Telegram thread, accept "approve" / "reject" replies. This is the L2 human interface.
Why first: thin shell over the now-shipped L1 backend. The hardest part is the message-classification rule; the round-trip plumbing is one HTTP POST and one Telegram callback.
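
Since the message-classification rule is called out as the hardest part, a minimal sketch of one possible rule follows. It assumes a simplified, hypothetical `IncomingMessage` shape (not the Telegram Bot API types) and treats any attached Markdown file as a PRD; `classifyMessage` and `PLAN_LINK` are illustrative names, not existing code.

```typescript
// Hypothetical message shape; the real Telegram payload and router types may differ.
interface IncomingMessage {
  text?: string;
  attachmentName?: string; // file name of an attached document, if any
}

type Classification =
  | { kind: "prd"; source: "link" | "attachment"; ref: string }
  | { kind: "decision"; verdict: "approve" | "reject" }
  | { kind: "other" };

// Matches a bare path or a URL ending in wiki/plans/<name>.md.
const PLAN_LINK = /(?:https?:\/\/\S+\/)?wiki\/plans\/[\w.-]+\.md\b/i;

function classifyMessage(msg: IncomingMessage): Classification {
  const text = (msg.text ?? "").trim();
  const verdict = text.toLowerCase();
  // "approve" / "reject" replies are only recognised as exact one-word messages.
  if (verdict === "approve" || verdict === "reject") {
    return { kind: "decision", verdict };
  }
  const link = text.match(PLAN_LINK);
  if (link) return { kind: "prd", source: "link", ref: link[0] };
  // Assumption: any attached Markdown file is treated as a PRD.
  const name = msg.attachmentName;
  if (name !== undefined && name.toLowerCase().endsWith(".md")) {
    return { kind: "prd", source: "attachment", ref: name };
  }
  return { kind: "other" };
}
```

Only messages classified as `prd` would trigger the POST to `/api/prd-to-pr`; everything classified as `other` stays on the interface agent's existing ad-hoc path.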

2. Runner orphan resilience — ~half a day

Files: every stage module that spawns claude -p — pipeline/stages/{spec,tests,implement,review,reviewerLoop}.ts.
What it does: add detached: true, stdio: ['ignore', 'pipe', 'pipe'] to the spawn() options so spawned claude -p children survive a tsx watch reload of the unified server. Also persist child PIDs in the row and add a "resume from last completed stage" code path in runner.ts.
Why second: today, editing any server file mid-run kills the in-flight pipeline, because tsx watch sends SIGTERM to the parent process and the signal propagates to its children. This caused 3 lost runs during the dogfood validation. Easy fix; high return.
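
The spawn change in item 2 can be sketched as below, using Node's `child_process.spawn` with the `detached` and `stdio` values named above. The helper names `detachedSpawnOptions` and `spawnDetached` are hypothetical, and how the PID lands on the run row is left to the caller.

```typescript
import { spawn, type ChildProcess, type SpawnOptions } from "node:child_process";

// Options named in item 2: put the child in its own process group and keep
// stdout/stderr piped so the stage can still capture agent logs.
function detachedSpawnOptions(cwd: string, env: NodeJS.ProcessEnv): SpawnOptions {
  return { cwd, env, detached: true, stdio: ["ignore", "pipe", "pipe"] };
}

// Hypothetical helper: spawn a command and return its PID so the caller can
// persist it on the run row for a later "resume" path.
function spawnDetached(
  command: string,
  args: string[],
  opts: SpawnOptions,
): { child: ChildProcess; pid: number | undefined } {
  const child = spawn(command, args, opts);
  // pid is undefined when the process could not be spawned.
  return { child, pid: child.pid };
}
```

One point worth verifying during implementation: Node's documentation notes that a detached, long-running child only stays alive after the parent exits if its stdio is not connected to the parent. With `'pipe'` on stdout and stderr, it may be safer to redirect output to the per-stage log files instead if children are still lost on reload.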

3. Per-iteration live telemetry inside stage 4 — ~half a day

File: be-agent-service/apps/server/src/pipeline/stages/implement.ts.
What it does: introduce a ctx.persist() callback (or import updatePrdRunStage directly with a small refactor to break the circular import) so the editor's iteration count, regression hint, and stash state land in the row mid-stage instead of only at stage exit. Today the dashboard's "Stage running: 4m 12s" indicator updates live via stage_started_at, but iteration_count only updates after the whole implement stage returns.
Why third: observability gap, not a correctness one. Run #12 stage 4 ran for 4m 50s with iter=0 displayed the entire time — the dashboard's "stuck?" badge can't fire correctly without this.
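
A minimal sketch of the `ctx.persist()` idea, assuming the iteration loop can be expressed as a function that receives the context. Apart from `iteration_count`, the field names and the `StageContext`, `IterationResult`, and `runImplementLoop` names are hypothetical, chosen only to show where the mid-stage writes would happen.

```typescript
// Hypothetical shapes; only iteration_count is a name taken from the text above.
interface ImplementProgress {
  iteration_count: number;
  regressionHint?: string;
  stashState?: "clean" | "stashed" | "restored";
}

interface StageContext {
  // Writes a partial update to the run row while the stage is still running.
  persist(patch: Partial<ImplementProgress>): Promise<void>;
}

interface IterationResult {
  passed: boolean;
  regressionHint?: string;
}

async function runImplementLoop(
  ctx: StageContext,
  iterate: (n: number) => Promise<IterationResult>,
  maxIterations: number,
): Promise<boolean> {
  for (let n = 1; n <= maxIterations; n++) {
    // Persist before the editor runs so the dashboard sees the live iteration.
    await ctx.persist({ iteration_count: n });
    const result = await iterate(n);
    if (result.passed) return true;
    if (result.regressionHint !== undefined) {
      await ctx.persist({ regressionHint: result.regressionHint });
    }
  }
  return false;
}
```

Passing the callback in through the context, rather than importing `updatePrdRunStage` into the stage module, is one way to sidestep the circular import mentioned above.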

4. Token / cost accounting

What it does: parse --output-format=json from claude -p invocations, sum input_tokens + output_tokens per stage, persist into tokens_total + per_role_breakdown on the row.
Why deferred: Max Plan OAuth (the current default) is flat-rate billing — per-call cost is meaningless and the dashboard correctly shows —. Implementing token accounting is mostly useful when API-key billing is reactivated for a specific run or model. Add a CLI flag --meter that opts a single run into JSON-output parsing rather than instrumenting unconditionally.
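
For the opt-in metered path, the summing step might look like the sketch below. It assumes each `claude -p --output-format=json` invocation prints a single JSON object with a `usage` block containing `input_tokens` and `output_tokens`; that shape should be checked against real CLI output. `ClaudeJsonResult`, `StageTokens`, and `sumStageTokens` are hypothetical names.

```typescript
// Assumed shape of one JSON result; verify against real CLI output before use.
interface ClaudeJsonResult {
  usage?: { input_tokens?: number; output_tokens?: number };
}

interface StageTokens {
  tokens_total: number;
  per_role_breakdown: Record<string, number>;
}

// `outputs` pairs a role name (e.g. "editor") with the raw stdout of one invocation.
function sumStageTokens(outputs: Array<{ role: string; stdout: string }>): StageTokens {
  const totals: StageTokens = { tokens_total: 0, per_role_breakdown: {} };
  for (const { role, stdout } of outputs) {
    let parsed: ClaudeJsonResult | null;
    try {
      parsed = JSON.parse(stdout) as ClaudeJsonResult | null;
    } catch {
      continue; // non-JSON output (unmetered run): skip rather than fail the stage
    }
    const usage = parsed?.usage;
    const tokens = (usage?.input_tokens ?? 0) + (usage?.output_tokens ?? 0);
    totals.tokens_total += tokens;
    totals.per_role_breakdown[role] = (totals.per_role_breakdown[role] ?? 0) + tokens;
  }
  return totals;
}
```

Skipping unparseable output keeps the default Max Plan runs unaffected: with no JSON to read, the totals stay at zero and the dashboard can keep showing its placeholder.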

5. (Defer) Model adapter + scale-or-kill / GrowthBook ramp

The model adapter from model-and-vendor-agnosticism.md is a non-trivial migration of every existing Darwin agent. Don't block items 1–4 on it; introduce the adapter in parallel and migrate the cheapest / highest-volume role first (probably cheap via Ollama).
The scale-or-kill agent from scale-or-kill.md is gated behind throughput data. The pipeline now produces real summary_json.stageHistory data — once a few real PRDs (not just banner-hello) have shipped, the agent has something to read.

How to use this map

Building the next missing piece? Pick the lowest-numbered item in §(c) that isn't yet in flight, scope it, open a PRD under wiki/plans/, and run it through the L0 path.
Adding a new agent to Darwin? Decide whether it's part of the engineering vertical (§(a)), a separate vertical (§(b)), or a missing piece of the PRD-to-PR pipeline (§(c)). Update this map in the same PR. The map is the audit; if the audit drifts, the wiki lies.
Reading this for the first time? Skim §(a) to see what's real, skim §(c) to see what's next. §(b) exists so you can stop wondering whether loki-content-writer is somehow part of the pipeline (it isn't).

Cross-references

README — orientation for the agentic-workflow section, including the L0 / L1 / L2 human interface tiers.
PRD-to-PR pipeline — the eight-stage pipeline this map provisions for.
Darwin as orchestrator — orchestrator-runtime decision and end-to-end flow.
Agent roles and model routing — five named roles and their closest analogues today.
Verification and evidence — dossier format the bundler in §(a) produces.
Reviewer feedback loop — Codex polling specifics for the loop in §(a).
Compound workflow — the precursor pipeline already running nightly.