Darwin as orchestrator | Agentic Workflow

The vision says humans define what; agents execute how. Darwin is the implementation: a single addressable entity (Telegram, eventually web UI) that takes a PRD and returns a merged PR with evidence. Beyond writing the PRD, the human's only required action is reading the dossier screenshot and replying "approve" or "reject".

This is the path to replacing the human CTO with the agent-CTO described in vision-and-mandate.md.

End-to-end flow

1. Human posts PRD link to Telegram
       ↓
2. interface-agent receives, validates frontmatter, posts ack
       ↓
3. interface-agent → orchestrator (engineering team)
       ↓
4. orchestrator runs the 7-stage pipeline (PRD → PR with dossier)
       ↓
5. After merge, Darwin posts the dossier screenshot back to Telegram
       ↓
6. Human looks at the screenshot, replies "approve" or "reject"
       ↓
7a. approve → done. Scale-or-kill agent takes over for ramp.
7b. reject → re-open spec for clarification, return to stage 2.
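
The flow above can be read as a small transition table. The following is a minimal sketch, not the actual implementation: every type and function name is hypothetical, and it assumes that "return to stage 2" in 7b means step 2 of this flow (interface-agent validation).

```typescript
// Hypothetical names throughout; only the step order and the approve/reject
// branches are taken from the flow above.
type FlowStep =
  | "prd_posted"        // 1. human posts PRD link to Telegram
  | "validated"         // 2. interface-agent validates frontmatter, posts ack
  | "handed_off"        // 3. interface-agent hands off to the orchestrator
  | "pipeline_run"      // 4. orchestrator runs the pipeline
  | "dossier_posted"    // 5. dossier screenshot posted back to Telegram
  | "awaiting_verdict"  // 6. human replies "approve" or "reject"
  | "done";             // 7a. approved

type Verdict = "approve" | "reject";

const ORDER: FlowStep[] = [
  "prd_posted",
  "validated",
  "handed_off",
  "pipeline_run",
  "dossier_posted",
  "awaiting_verdict",
  "done",
];

// Normalise a Telegram reply; anything other than the two verdicts is ignored.
function parseVerdict(reply: string): Verdict | undefined {
  const text = reply.trim().toLowerCase();
  if (text === "approve") return "approve";
  if (text === "reject") return "reject";
  return undefined;
}

// Compute the next step. Only step 6 branches on the human verdict.
function nextStep(current: FlowStep, verdict?: Verdict): FlowStep {
  if (current === "done") return "done";
  if (current === "awaiting_verdict") {
    if (verdict === undefined) return "awaiting_verdict"; // keep waiting
    // 7b: assumed to mean "back to step 2 of this flow".
    return verdict === "approve" ? "done" : "validated";
  }
  const next = ORDER[ORDER.indexOf(current) + 1];
  return next ?? current;
}
```

Keeping the verdict parsing separate from the transition makes it easy to treat any unrecognised reply as "still waiting" rather than as an implicit rejection.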

What Darwin is, today

Per darwin-dashboard.md:

Unified API + UI at localhost:3010 (be-agent-service/apps/server/src/index.ts:42).
Endpoints: /api/repos/<repo>/{status,priorities,logs}.
Backed by the SQLite database at .compound-state/agent-service.db.
Hosts the schedule research loop, the marketing team, and the engineering compound runs.
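
As an illustration of the endpoint shape listed above, here is a minimal sketch of a URL builder. The three resource names and the localhost:3010 address come from this page; the http scheme, the helper name, and the type name are assumptions.

```typescript
// Resource names are taken from /api/repos/<repo>/{status,priorities,logs}.
type RepoResource = "status" | "priorities" | "logs";

// Assumption: plain http on the documented local port.
const DARWIN_BASE_URL = "http://localhost:3010";

// Hypothetical helper: builds the URL for one repo-scoped resource.
function repoEndpoint(
  repo: string,
  resource: RepoResource,
  baseUrl: string = DARWIN_BASE_URL,
): string {
  // Encode the repo segment so an unexpected "/" cannot change the route.
  return `${baseUrl}/api/repos/${encodeURIComponent(repo)}/${resource}`;
}
```

Restricting the resource argument to a union type means a typo such as "priority" would be caught at compile time rather than surfacing as a 404.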

What's missing for Darwin-as-orchestrator:

/api/prd-to-pr/<feature> endpoint that accepts a PRD path and starts the pipeline.
Telegram → orchestrator routing for PRD-style messages (today messages route to general-purpose subagents, not the pipeline).
The pipeline itself (it's documented across this section but not yet a single executable pass).
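
To make the first two gaps concrete, here is a hedged sketch of how a Telegram message might be routed either to the pipeline endpoint or to a general-purpose subagent. The detection rule (a token containing "prd" and ending in .md) and the way the feature name is derived from the file name are assumptions for illustration only; this section does not define either.

```typescript
// Hypothetical routing result: either start the pipeline or fall through
// to today's behaviour (general-purpose subagents).
type Route =
  | { target: "pipeline"; endpoint: string; prdPath: string }
  | { target: "general-subagent" };

// Assumption: a PRD-style message contains a path or URL that mentions
// "prd" and ends in ".md". The real rule is not specified here.
const PRD_LINK = /\S*prd\S*\.md\b/i;

// Assumption: the <feature> segment is the PRD file name without ".md".
function featureFromPath(prdPath: string): string {
  const fileName = prdPath.split("/").pop() ?? prdPath;
  return fileName.replace(/\.md$/i, "");
}

function routeMessage(text: string): Route {
  const match = PRD_LINK.exec(text);
  if (match === null) return { target: "general-subagent" };
  const prdPath = match[0];
  return {
    target: "pipeline",
    endpoint: `/api/prd-to-pr/${encodeURIComponent(featureFromPath(prdPath))}`,
    prdPath,
  };
}
```

For example, a message containing https://example.test/prds/prd-dark-mode.md would be routed to /api/prd-to-pr/prd-dark-mode under these assumptions, while a message with no such link keeps its current routing.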

Decision: orchestrator runtime is a thin in-house Node loop

Recorded 2026-04-30 to stop the next builder from relitigating this. The pipeline runner lives inside be-agent-service as a thin (~200 LOC) Node state machine, not as a LangGraph (Python) sidecar or any external workflow engine.

Reasoning:

Matches the existing be-agent-service/apps/server/src/routes/* style — no new language, no new process boundary.
Persists state in .compound-state/agent-service.db, which already exists and is already read by the throughput logger.
Reuses the launchd job slots that already host the schedule research loop and compound nightly workflow.
Each pipeline stage is a function that calls the model adapter (see model-and-vendor-agnosticism.md) or shells out (gh, playwright, yarn). No DAG library is needed for seven linear stages with one re-entry edge (stage 7 → stage 4).
Revisit only if the state machine outgrows ~500 LOC or the re-entry topology becomes non-trivial.
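
A minimal sketch of what such a state machine could look like, assuming linear stages and the single stage 7 → stage 4 re-entry edge. All names are hypothetical, the stages are synchronous for brevity (real stages that call a model adapter or shell out would be async), the persist callback stands in for a write to the sqlite state DB, and the re-entry cap is an assumption, not something this page specifies.

```typescript
type StageResult = "advance" | "reenter";

interface PipelineState {
  stage: number;     // 1-based index of the stage to run next
  reentries: number; // how many times the re-entry edge has fired
  status: "running" | "done" | "failed";
}

type Stage = (state: PipelineState) => StageResult;

// The one non-linear edge described above.
const REENTRY_FROM = 7;
const REENTRY_TO = 4;

function runPipeline(
  stages: Stage[],
  persist: (state: PipelineState) => void, // stand-in for the sqlite write
  maxReentries = 3,                        // assumed cap to avoid endless loops
): PipelineState {
  let state: PipelineState = { stage: 1, reentries: 0, status: "running" };
  while (state.status === "running") {
    const stage = stages[state.stage - 1];
    if (stage === undefined) {
      state = { ...state, status: "failed" }; // no such stage
    } else {
      const result = stage(state);
      if (result === "reenter") {
        // Re-entry is only legal from stage 7, and only up to the cap.
        const allowed =
          state.stage === REENTRY_FROM && state.reentries < maxReentries;
        state = allowed
          ? { ...state, stage: REENTRY_TO, reentries: state.reentries + 1 }
          : { ...state, status: "failed" };
      } else if (state.stage === stages.length) {
        state = { ...state, status: "done" }; // last stage finished
      } else {
        state = { ...state, stage: state.stage + 1 };
      }
    }
    persist(state); // checkpoint after every transition
  }
  return state;
}
```

Persisting after every transition means a restarted launchd job could, in principle, resume from the last recorded stage rather than from stage 1; the resume logic itself is left out of this sketch.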

Why a single orchestrator

If we build N agents that all do "PRD-to-PR" with slight variations, we get N versions of the spec format, N reviewer-feedback loops, N dossier formats. The vision rule "humans define what" implies a single, well-defined interface — many orchestrators violate that.

Darwin is the chosen orchestrator because:

It already owns the agent-service runtime.
It already has the launchd job slots for long-lived loops.
It already brokers the Telegram interface via interface-agent.
It already has the sqlite state DB the throughput logger writes to.

What Darwin needs from this section

The pages in this section define the contracts Darwin's orchestrator must implement:

Concern	Defined in
What stages exist, what they emit	prd-to-pr-pipeline.md
Which agent runs which stage	agent-roles-and-model-routing.md
How models are chosen	model-and-vendor-agnosticism.md
What "spec" means	spec-as-contract.md
What "evidence" means	verification-and-evidence.md
When to re-enter the editor	reviewer-feedback-loop.md
How to ramp / roll-back after merge	scale-or-kill.md
What budget constraints apply	throughput-and-business-signals.md

Replacing the human CTO

Vision quote:

The long-term bottleneck of Caire should be compute, not headcount.

The CTO replacement happens when:

PRDs go straight to Darwin without going through a human engineering manager.
Routing decisions across the four agent teams (engineering, optimization, marketing, management) come from cpo-cto (the agent), not the human.
Scale-or-kill decisions happen daily without a human deciding "let's ramp this" or "let's kill that".
Budget allocation happens at management-agent cadence (CEO + CPO/CTO) with the human only ratifying at quarterly checkpoints.

Each item is a concrete, testable transition. We're not at any of them yet. The order in which they should land:

1. PRD → Darwin → PR (this section's main deliverable). Easiest. Captures the highest-volume work.
2. Scale-or-kill automation. Next. Removes the daily ramp/roll-back overhead.
3. Budget allocation by management agents. Third. Requires finance integration + trust calibration.
4. Cross-team routing by cpo-cto. Last. Requires (1)–(3) to be solid; otherwise the cross-team router is making decisions with bad inputs.

Where the human still shows up

After this all lands:

Writing PRDs.
Reading the dossier screenshot after merge and replying "approve" or "reject".
Quarterly ratification of cpo-cto's routing decisions.
Strategic direction at the CEO-agent input level.

That's it. Everything else is software. That's the vision.

Cross-references

Vision and mandate.
Darwin Dashboard — what Darwin is today.
Management agents — interface-agent, cpo-cto, ceo.
Engineering agents — the team Darwin orchestrates.
Compound workflow — current nightly approximation of the pipeline.
Wiki as agent substrate (roadmap) — concrete delivery slices.