Throughput and business signals | Agentic Workflow

Two coupled metrics drive the agentic pipeline:

Features per second per token — the orchestrator's efficiency, used by the optimization mathematician to tune model routing.
Business signals (revenue, cost, funnel, cash) — bound how aggressively scale-or-kill ramps. The mathematician chooses model routing inside the cash budget, never above it.

Features per second per token

Definition

features_per_second_per_token = features_shipped / (elapsed_seconds × total_tokens_consumed)

Where:

features_shipped = count of PRs merged that carry a docs/dossiers/ subdirectory and a wiki/plans/ PRD reference. Cosmetic / hygiene PRs don't count.
elapsed_seconds = wall-clock time across the intake + eight pipeline stages, per feature.
total_tokens_consumed = sum of input + output tokens across all model calls in the pipeline, per feature.

The metric is deliberately tiny in absolute value (10⁻⁹ scale). What matters is the trajectory: are we trending toward more features per token over time? Are specific routing changes moving the number?

Where it's logged

Every PR that merges through the pipeline writes a row to .compound-state/agent-service.db (sqlite) on the Mac mini server:

CREATE TABLE pipeline_runs (
  id              INTEGER PRIMARY KEY,
  feature_slug    TEXT,
  spec_path       TEXT,
  pr_number       INTEGER,
  merged_at       TEXT,           -- ISO timestamp
  elapsed_seconds INTEGER,        -- stage 1 → stage 8
  tokens_total    INTEGER,
  cost_usd        REAL,
  features_per_second_per_token REAL,
  per_role_breakdown JSON         -- tokens + cost per role
);

Schema is illustrative — the actual table lives in be-agent-service and evolves with the pipeline.

Who reads it

The optimization mathematician agent (see agents-optimization.md) reads this table on a weekly cadence to propose model-routing rotations. Inputs to its proposal:

Current routing matrix.
Per-role pass-rate, cost-per-merged-PR, and 95th-percentile latency over the prior week.
Public benchmarks: SWE-bench Verified pass@1, polyglot, etc.
Cash budget headroom from the business-signals feed.

Output: a proposed adjustment to model-and-vendor-agnosticism.md's routing matrix, with a justification grounded in the data above. The CPO/CTO management agent ratifies or rejects.

Business signals

The pipeline is not allowed to optimise pure throughput in isolation — it must respect cash. Vision commitment (d): budgets and cash balance are system inputs.

Inputs

Signal	Source	Update cadence
Cash balance	Finance system / bank API	Daily
Monthly burn	Derived from cash balance trajectory	Weekly
Revenue	Stripe / billing system	Daily
Funnel metrics	Product analytics (`shuri-product-analyst` agent)	Daily
Pipeline cost	`.compound-state/agent-service.db` rollup	Real-time per PR

These feeds are read by the orchestrator before any expensive operation. The orchestrator can refuse to spawn an Architect call if the projected pipeline cost would exceed today's budget.

Constraints derived from signals

daily_pipeline_budget_usd = monthly_pipeline_budget × (cash_runway_factor × revenue_growth_factor)

per_feature_budget_usd    = daily_pipeline_budget_usd / expected_features_today

Order of operations:

The CEO management agent sets monthly_pipeline_budget based on cash + revenue.
The CPO/CTO management agent allocates that across pipeline cost, scale-or-kill ramps, and reserved capacity.
The orchestrator queries the residual daily budget before each pipeline run.
The mathematician's routing proposals are constrained: never propose a rotation that would push expected per-feature cost above the budget.

What "tied to business signals" means concretely

If revenue accelerates → cash runway extends → daily pipeline budget grows → mathematician can rotate to higher-quality (= more expensive) routes for Architect / Editor.

If revenue softens → cash runway tightens → daily pipeline budget shrinks → mathematician rotates toward cheaper routes; aggressive scaling pauses; non-essential pipeline branches (nightly research lab campaigns) are throttled.

The point of vision commitment (d) is that this happens without a human deciding "ok, tighten the belt". The signal is the input; the response is mechanical.

Pilot phase: budget enforcement is OFF by default

The runner gates cost-ceiling enforcement behind PIPELINE_BUDGET_ENFORCEMENT (default off). Code at be-agent-service/apps/server/src/pipeline/budget.ts.

Why off in pilot:

One human is the only PRD producer and ratifies every PRD before kickoff — there's no risk of runaway cost from external/automated PRD producers.
Revenue is zero; a revenue-derived daily budget would also be zero. Enforcing it would refuse all work before any feature ships.
The mathematician's routing-proposal cadence (weekly) doesn't yet have throughput data to read — the runner has to produce some merged PRs first.

Flip on (PIPELINE_BUDGET_ENFORCEMENT=on, optional MAX_COST_PER_RUN_USD=...) when:

Multiple humans (or agents) create PRDs concurrently.
Real revenue / cash signals are wired into the orchestrator's pre-flight check.
The mathematician has at least 4–6 weeks of throughput rows to base routing decisions on.

The runner logs budget enforcement ON / budget enforcement OFF (pilot mode) on every run kickoff so operators see the state at a glance.

What this rules out

"We always run Opus for Architect." → No. Architect runs whatever the routing matrix says, and the routing matrix is a function of business signals.
"We always run lab waves at full capacity." → No. Lab capacity scales with budget headroom.
Tracking cost per role in isolation. → No. Cost is rolled up to per-feature, then per-day, then compared to budget.

What this allows

Auto-throttle when cash gets tight, without a human deciding when "tight" begins.
Auto-expand when growth is good, without waiting for a quarterly review to bump the cap.
Per-feature cost ceilings (vetoes the pipeline before it starts if the spec is too ambitious for today's budget).
A single, auditable trail: every routing decision points back to a budget computation that points back to a cash-balance reading.

Cross-references

Vision and mandate — commitment (d).
Model and vendor agnosticism — the routing matrix this controls.
Scale or kill — uses the cash-budget input to gate ramps.
Optimization mathematician — consumes the throughput rows; proposes routing changes.
Management agents — ceo, cpo-cto, cmo-cso — own the budget allocation.