Two coupled metrics drive the agentic pipeline:
- Features per second per token: the orchestrator's efficiency, used by the optimization mathematician to tune model routing.
- Business signals (revenue, cost, funnel, cash): these bound how aggressively scale-or-kill ramps. The mathematician chooses model routing inside the cash budget, never above it.
Features per second per token
Definition
features_per_second_per_token = features_shipped / (elapsed_seconds × total_tokens_consumed)
Where:
- features_shipped = count of merged PRs that carry a docs/dossiers/ subdirectory and a wiki/plans/ PRD reference. Cosmetic / hygiene PRs don't count.
- elapsed_seconds = wall-clock time across the intake + eight pipeline stages, per feature.
- total_tokens_consumed = sum of input + output tokens across all model calls in the pipeline, per feature.
The metric is deliberately tiny in absolute value (10⁻⁹ scale). What matters is the trajectory: are we trending toward more features per token over time? Are specific routing changes moving the number?
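A minimal sketch of the arithmetic, using made-up numbers rather than measured ones (one feature, 1,000 seconds of wall-clock time, one million tokens). The function name and the sample values are illustrative assumptions; only the formula comes from the definition above.

```python
def features_per_second_per_token(features_shipped: int,
                                  elapsed_seconds: float,
                                  total_tokens_consumed: int) -> float:
    """Apply the definition above; time and tokens must be positive."""
    if elapsed_seconds <= 0 or total_tokens_consumed <= 0:
        raise ValueError("elapsed_seconds and total_tokens_consumed must be positive")
    return features_shipped / (elapsed_seconds * total_tokens_consumed)


# Hypothetical run: 1 feature, 1,000 s wall-clock, 1M tokens.
baseline = features_per_second_per_token(1, 1_000, 1_000_000)

# Same feature after a hypothetical routing change that halves token use.
after_change = features_per_second_per_token(1, 1_000, 500_000)

# The ratio between two readings is the part worth tracking.
improvement = after_change / baseline
```

With these inputs the baseline lands on the 10⁻⁹ scale mentioned above, and halving token use doubles the metric. The absolute value is hard to interpret on its own, so the ratio between readings is what makes a routing change visible.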
Where it's logged
Every PR that merges through the pipeline writes a row to .compound-state/agent-service.db (SQLite) on the Mac mini server:
CREATE TABLE pipeline_runs (
id INTEGER PRIMARY KEY,
feature_slug TEXT,
spec_path TEXT,
pr_number INTEGER,
merged_at TEXT, -- ISO timestamp
elapsed_seconds INTEGER, -- stage 1 → stage 8
tokens_total INTEGER,
cost_usd REAL,
features_per_second_per_token REAL,
per_role_breakdown JSON -- tokens + cost per role
);
Schema is illustrative — the actual table lives in be-agent-service and evolves with the pipeline.
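As a hedged illustration of how rows in this table could be written and read back for the trajectory question, the sketch below uses Python's standard sqlite3 module against an in-memory database. The helper names (record_run, weekly_trend), the sample slugs, spec paths, PR numbers, and the JSON shape of per_role_breakdown are all assumptions. The real writer lives in be-agent-service and may differ.

```python
import json
import sqlite3

# Column list copied from the illustrative schema above.
SCHEMA = """
CREATE TABLE pipeline_runs (
  id INTEGER PRIMARY KEY,
  feature_slug TEXT,
  spec_path TEXT,
  pr_number INTEGER,
  merged_at TEXT,
  elapsed_seconds INTEGER,
  tokens_total INTEGER,
  cost_usd REAL,
  features_per_second_per_token REAL,
  per_role_breakdown JSON
);
"""


def record_run(conn, feature_slug, spec_path, pr_number, merged_at,
               elapsed_seconds, tokens_total, cost_usd, per_role_breakdown):
    """Insert one row per merged PR; each row counts as one shipped feature."""
    metric = 1 / (elapsed_seconds * tokens_total)
    conn.execute(
        "INSERT INTO pipeline_runs (feature_slug, spec_path, pr_number, merged_at,"
        " elapsed_seconds, tokens_total, cost_usd, features_per_second_per_token,"
        " per_role_breakdown) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (feature_slug, spec_path, pr_number, merged_at, elapsed_seconds,
         tokens_total, cost_usd, metric, json.dumps(per_role_breakdown)),
    )


def weekly_trend(conn):
    """Return (week, mean metric, merged PR count) tuples, oldest week first."""
    return conn.execute(
        "SELECT strftime('%Y-%W', merged_at) AS week,"
        " AVG(features_per_second_per_token), COUNT(*)"
        " FROM pipeline_runs GROUP BY week ORDER BY week"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for the real database file
conn.executescript(SCHEMA)
breakdown = {"architect": {"tokens": 400_000, "cost_usd": 2.5}}  # assumed shape
record_run(conn, "demo-a", "wiki/plans/demo-a.md", 101,
           "2025-01-06T10:00:00", 1_000, 1_000_000, 4.0, breakdown)
record_run(conn, "demo-b", "wiki/plans/demo-b.md", 102,
           "2025-01-08T10:00:00", 2_000, 1_000_000, 5.0, breakdown)
record_run(conn, "demo-c", "wiki/plans/demo-c.md", 103,
           "2025-01-14T10:00:00", 1_000, 500_000, 3.0, breakdown)
trend = weekly_trend(conn)
```

Grouping by week is one way to turn individual rows into the trend the metric is meant to expose. A rising weekly mean suggests more features per token over time.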
Who reads it
The optimization mathematician agent (see agents-optimization.md) reads this table on a weekly cadence to propose model-routing rotations. Inputs to its proposal:
- Current routing matrix.
- Per-role pass-rate, cost-per-merged-PR, and 95th-percentile latency over the prior week.
- Public benchmarks: SWE-bench Verified pass@1, polyglot, etc.
- Cash budget headroom from the business-signals feed.
Output: a proposed adjustment to model-and-vendor-agnosticism.md's routing matrix, with a justification grounded in the data above. The CPO/CTO management agent ratifies or rejects.
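Two of the inputs listed above can be sketched from the logged rows. The example assumes per_role_breakdown is stored as JSON of the shape {"role": {"tokens": ..., "cost_usd": ...}}, and it uses run-level elapsed times as a stand-in for latency, because the illustrative schema records no per-role latency or pass-rate. Function names and sample values are hypothetical.

```python
import json
from collections import defaultdict


def cost_per_merged_pr_by_role(breakdowns):
    """Average cost per merged PR for each role.

    `breakdowns` holds one per_role_breakdown JSON string per merged PR, in the
    assumed shape {"role": {"tokens": int, "cost_usd": float}}.
    """
    totals = defaultdict(float)
    merged_prs = 0
    for raw in breakdowns:
        merged_prs += 1
        for role, usage in json.loads(raw).items():
            totals[role] += usage["cost_usd"]
    if merged_prs == 0:
        return {}
    return {role: total / merged_prs for role, total in totals.items()}


def p95_nearest_rank(values):
    """95th percentile by the nearest-rank method, using integer arithmetic."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) without floats
    return ordered[rank - 1]


# Hypothetical week with two merged PRs.
week_breakdowns = [
    json.dumps({"architect": {"tokens": 300_000, "cost_usd": 2.0},
                "editor": {"tokens": 100_000, "cost_usd": 1.0}}),
    json.dumps({"architect": {"tokens": 500_000, "cost_usd": 4.0}}),
]
role_costs = cost_per_merged_pr_by_role(week_breakdowns)
latency_p95 = p95_nearest_rank([900, 1_200, 1_500, 3_600])
```

A role that appears in only some PRs (editor here) is still averaged over all merged PRs, which keeps the per-role figures additive with the per-feature cost.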
Business signals
The pipeline is not allowed to optimise pure throughput in isolation — it must respect cash. Vision commitment (d): budgets and cash balance are system inputs.
Inputs
| Signal | Source | Update cadence |
|---|---|---|
| Cash balance | Finance system / bank API | Daily |
| Monthly burn | Derived from cash balance trajectory | Weekly |
| Revenue | Stripe / billing system | Daily |
| Funnel metrics | Product analytics (shuri-product-analyst agent) | Daily |
| Pipeline cost | .compound-state/agent-service.db rollup | Real-time per PR |
These feeds are read by the orchestrator before any expensive operation. The orchestrator can refuse to spawn an Architect call if the projected pipeline cost would exceed today's budget.
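The refusal described above amounts to a comparison against the residual budget. A minimal sketch, assuming the orchestrator already has a projected cost, today's spend so far, and today's budget as plain numbers; the PreflightDecision type and the preflight function are hypothetical names, not the orchestrator's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreflightDecision:
    allowed: bool
    reason: str


def preflight(projected_cost_usd: float,
              spent_today_usd: float,
              daily_budget_usd: float) -> PreflightDecision:
    """Refuse an expensive call when it would exceed what is left of today's budget."""
    residual = daily_budget_usd - spent_today_usd
    if projected_cost_usd > residual:
        return PreflightDecision(
            False,
            f"projected ${projected_cost_usd:.2f} exceeds residual ${residual:.2f}",
        )
    return PreflightDecision(True, "within residual budget")


# Hypothetical day: $50 budget, $45 already spent.
refused = preflight(projected_cost_usd=12.0, spent_today_usd=45.0, daily_budget_usd=50.0)
accepted = preflight(projected_cost_usd=4.0, spent_today_usd=45.0, daily_budget_usd=50.0)
```

Returning a reason alongside the boolean keeps the refusal auditable, since the log line can state which budget reading blocked the call.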
Constraints derived from signals
daily_pipeline_budget_usd = monthly_pipeline_budget × (cash_runway_factor × revenue_growth_factor)
per_feature_budget_usd = daily_pipeline_budget_usd / expected_features_today
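A worked instance of the two formulas, under stated assumptions: this document does not define cash_runway_factor or revenue_growth_factor, so the values below are placeholders, and the sketch assumes their product already carries the monthly-to-daily scaling (0.03 × 1.1 = 0.033 of the monthly budget per day).

```python
def daily_pipeline_budget_usd(monthly_pipeline_budget: float,
                              cash_runway_factor: float,
                              revenue_growth_factor: float) -> float:
    """First formula above, applied as written."""
    return monthly_pipeline_budget * (cash_runway_factor * revenue_growth_factor)


def per_feature_budget_usd(daily_budget_usd: float,
                           expected_features_today: int) -> float:
    """Second formula above; guards against dividing by zero features."""
    if expected_features_today <= 0:
        raise ValueError("expected_features_today must be positive")
    return daily_budget_usd / expected_features_today


# Placeholder inputs: $3,000 monthly budget, three features expected today.
daily = daily_pipeline_budget_usd(3_000.0, cash_runway_factor=0.03,
                                  revenue_growth_factor=1.1)
per_feature = per_feature_budget_usd(daily, expected_features_today=3)
```

With these placeholders the daily budget comes to about $99 and the per-feature ceiling to about $33. Raising revenue_growth_factor raises both, which is the lever the rest of this section relies on.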
Order of operations:
- The CEO management agent sets monthly_pipeline_budget based on cash + revenue.
- The CPO/CTO management agent allocates that across pipeline cost, scale-or-kill ramps, and reserved capacity.
- The orchestrator queries the residual daily budget before each pipeline run.
- The mathematician's routing proposals are constrained: never propose a rotation that would push expected per-feature cost above the budget.
What "tied to business signals" means concretely
If revenue accelerates → cash runway extends → daily pipeline budget grows → mathematician can rotate to higher-quality (= more expensive) routes for Architect / Editor.
If revenue softens → cash runway tightens → daily pipeline budget shrinks → mathematician rotates toward cheaper routes; aggressive scaling pauses; non-essential pipeline branches (nightly research lab campaigns) are throttled.
The point of vision commitment (d) is that this happens without a human deciding "ok, tighten the belt". The signal is the input; the response is mechanical.
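One way to picture the mechanical response: given candidate routes with an expected per-feature cost and some quality score, pick the best route that fits the per-feature budget. The Route type, the route names, the costs, and the quality scores are all hypothetical, and the real mathematician may weigh more inputs than this sketch does.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Route:
    name: str
    expected_cost_per_feature_usd: float
    quality_score: float  # hypothetical score; higher is better


def pick_route(routes: Sequence[Route],
               per_feature_budget_usd: float) -> Optional[Route]:
    """Best-quality route whose expected cost stays within the budget, if any."""
    affordable = [r for r in routes
                  if r.expected_cost_per_feature_usd <= per_feature_budget_usd]
    if not affordable:
        return None  # nothing fits: the run is vetoed rather than overspending
    return max(affordable, key=lambda r: r.quality_score)


# Hypothetical routing candidates for one role.
candidates = [
    Route("route-premium", expected_cost_per_feature_usd=40.0, quality_score=0.9),
    Route("route-standard", expected_cost_per_feature_usd=12.0, quality_score=0.7),
    Route("route-economy", expected_cost_per_feature_usd=5.0, quality_score=0.5),
]

when_revenue_accelerates = pick_route(candidates, per_feature_budget_usd=50.0)
when_revenue_softens = pick_route(candidates, per_feature_budget_usd=15.0)
when_budget_is_exhausted = pick_route(candidates, per_feature_budget_usd=2.0)
```

No branch in the selection asks a human anything. The only thing that changes between the three calls is the budget figure, which is itself derived from the signals.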
Pilot phase: budget enforcement is OFF by default
The runner gates cost-ceiling enforcement behind PIPELINE_BUDGET_ENFORCEMENT (default off). Code at be-agent-service/apps/server/src/pipeline/budget.ts.
Why off in pilot:
- One human is the only PRD producer and ratifies every PRD before kickoff — there's no risk of runaway cost from external/automated PRD producers.
- Revenue is zero; a revenue-derived daily budget would also be zero. Enforcing it would refuse all work before any feature ships.
- The mathematician's routing-proposal cadence (weekly) doesn't yet have throughput data to read — the runner has to produce some merged PRs first.
Flip on (PIPELINE_BUDGET_ENFORCEMENT=on, optional MAX_COST_PER_RUN_USD=...) when:
- Multiple humans (or agents) create PRDs concurrently.
- Real revenue / cash signals are wired into the orchestrator's pre-flight check.
- The mathematician has at least 4–6 weeks of throughput rows to base routing decisions on.
The runner logs either "budget enforcement ON" or "budget enforcement OFF (pilot mode)" at every run kickoff so operators see the state at a glance.
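The gating logic can be sketched as follows. This is not the code in budget.ts (which is TypeScript and not reproduced here); it is an illustration that assumes "on" is the only value that enables enforcement and that MAX_COST_PER_RUN_USD, when set, parses as a plain number. The function names are hypothetical.

```python
from typing import Mapping, Optional, Tuple


def read_budget_config(env: Mapping[str, str]) -> Tuple[bool, Optional[float]]:
    """Return (enforcement enabled, optional per-run cost cap in USD)."""
    # Assumption: anything other than "on" (case-insensitive) means off.
    enforce = env.get("PIPELINE_BUDGET_ENFORCEMENT", "off").strip().lower() == "on"
    raw_cap = env.get("MAX_COST_PER_RUN_USD")
    cap = float(raw_cap) if raw_cap else None
    return enforce, cap


def kickoff_banner(enforce: bool) -> str:
    """Log line emitted at run kickoff, matching the wording described above."""
    return "budget enforcement ON" if enforce else "budget enforcement OFF (pilot mode)"


# Pilot default: nothing set in the environment.
pilot = read_budget_config({})

# After the flip, with a hypothetical per-run cap.
enforced = read_budget_config({"PIPELINE_BUDGET_ENFORCEMENT": "on",
                               "MAX_COST_PER_RUN_USD": "7.50"})
```

Passing the environment in as a mapping, instead of reading os.environ inside the function, keeps the sketch testable. A real runner would pass os.environ at the call site.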
What this rules out
- "We always run Opus for Architect." → No. Architect runs whatever the routing matrix says, and the routing matrix is a function of business signals.
- "We always run lab waves at full capacity." → No. Lab capacity scales with budget headroom.
- "We track cost per role in isolation." → No. Cost is rolled up to per-feature, then per-day, then compared to budget.
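The roll-up order in the last bullet (per feature, then per day, then against the budget) can be sketched with a single query. The example uses an in-memory SQLite table holding only the three columns it needs; the sample rows and the $50 budget are made up, and the function name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real database file
conn.execute(
    "CREATE TABLE pipeline_runs (feature_slug TEXT, merged_at TEXT, cost_usd REAL)"
)
conn.executemany(
    "INSERT INTO pipeline_runs VALUES (?, ?, ?)",
    [
        ("demo-a", "2025-01-06T09:00:00", 20.0),
        ("demo-b", "2025-01-06T15:00:00", 35.0),
        ("demo-c", "2025-01-07T11:00:00", 10.0),
    ],
)

ROLLUP = """
WITH per_feature AS (
  SELECT feature_slug, date(merged_at) AS day, SUM(cost_usd) AS feature_cost
  FROM pipeline_runs
  GROUP BY feature_slug, day
)
SELECT day, SUM(feature_cost) AS day_cost
FROM per_feature
GROUP BY day
ORDER BY day
"""


def daily_rollup(conn, daily_budget_usd):
    """Return (day, total cost, within budget?) tuples, oldest day first."""
    return [(day, cost, cost <= daily_budget_usd)
            for day, cost in conn.execute(ROLLUP)]


report = daily_rollup(conn, daily_budget_usd=50.0)
```

In this made-up data the first day totals $55 and breaches the $50 budget even though neither feature does so alone, which is why the comparison happens at the rolled-up level.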
What this allows
- Auto-throttle when cash gets tight, without a human deciding when "tight" begins.
- Auto-expand when growth is good, without waiting for a quarterly review to bump the cap.
- Per-feature cost ceilings, which veto a pipeline run before it starts if the spec is too ambitious for today's budget.
- A single, auditable trail: every routing decision points back to a budget computation that points back to a cash-balance reading.
Cross-references
- Vision and mandate — commitment (d).
- Model and vendor agnosticism — the routing matrix this controls.
- Scale or kill — uses the cash-budget input to gate ramps.
- Optimization mathematician — consumes the throughput rows; proposes routing changes.
- Management agents (ceo, cpo-cto, cmo-cso): own the budget allocation.