Agentic Workflow

Scale or kill

Auto-promote what works; auto-kill what regresses. Hands-free ramp via GrowthBook + business signals.

Manual ramp-up and roll-back are repeating tasks. By the vision rule, "if humans repeat a task, the system is wrong." This page describes the mechanism that takes the human out of the scale / kill loop.

The two automations

Scale (auto-promote)

When a feature's post-merge metrics beat its PRD baseline for N consecutive days, the feature gate ramps automatically:

1% → 10% → 25% → 50% → 100% over 5 days, gated by daily metric check

If at any step the metric regresses, the ramp pauses and reverses (see "kill" below).
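The ladder above can be sketched as a small step function. This is only an illustration: `RAMP_STEPS` and `next_ramp_percentage` are hypothetical names, and the rule "advance one step per passing daily check" is an assumption drawn from the description, not an existing implementation in this repo.

```python
from typing import Optional

# Ramp ladder from the text: 1% -> 10% -> 25% -> 50% -> 100%.
RAMP_STEPS = [1, 10, 25, 50, 100]


def next_ramp_percentage(current_pct: int, metric_ok: bool) -> Optional[int]:
    """Return the next rollout percentage, or None when the daily check failed.

    Assumption: a failed check never advances the ramp; the caller pauses
    and hands the feature to the kill path instead.
    """
    if not metric_ok:
        return None
    # Pick the first ladder step strictly above the current percentage.
    for step in RAMP_STEPS:
        if step > current_pct:
            return step
    # Already at (or above) the top of the ladder: stay at full rollout.
    return RAMP_STEPS[-1]
```

Returning `None` rather than a lower percentage keeps the reversal decision in one place, which matches the text's hand-off to the kill automation.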

Kill (auto-rollback)

When a feature's post-merge metrics regress past a threshold for N consecutive days, the feature flag is disabled automatically.

The code stays in main — only the flag flips. Re-enabling is a separate decision that re-enters the PRD pipeline.
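A minimal sketch of the "N consecutive days" check, assuming the daily values are expressed as the metric minus its PRD baseline (in points) and that the threshold is a negative number. `should_kill` and its parameters are hypothetical names; the value of N is not fixed by this page, so it is passed in.

```python
from typing import Sequence


def should_kill(
    daily_deltas: Sequence[float],
    regression_threshold: float,
    consecutive_days: int,
) -> bool:
    """True when the most recent `consecutive_days` deltas all breach the threshold.

    daily_deltas: metric minus baseline per day, oldest first (assumed shape).
    regression_threshold: a negative number; a delta at or below it counts
    as a regression.
    """
    if consecutive_days < 1:
        raise ValueError("consecutive_days must be at least 1")
    if len(daily_deltas) < consecutive_days:
        return False  # not enough history yet to justify a kill
    recent = daily_deltas[-consecutive_days:]
    return all(delta <= regression_threshold for delta in recent)
```

A `True` result would only flip the flag off; nothing in this check touches the code in main.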

Required for this to work

| Piece | Status |
|---|---|
| Feature flag service | Required: GrowthBook or equivalent. Not yet wired up in this repo. |
| Per-feature metric definitions | Required: each PRD must declare a primary metric and a regression threshold. |
| Notification path | Reuse interface-agent for Telegram on every ramp / roll-back decision. |

The PRD template needs three new fields that don't exist today:

primary_metric: "promotion_rate_pct" # what to watch
regression_threshold: -2.0 # rollback if metric drops by ≥2 points
ramp_window_days: 5 # full-rollout cadence

These need to land in SCHEMA.md as part of implementing this page; see the roadmap entry in wiki-as-agent-substrate-2026-04-29.md.
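One way the agent might validate these fields once the PRD frontmatter has been parsed into a mapping. This is a sketch under assumptions: `RampConfig` and `parse_ramp_config` are hypothetical names, and the rule that `regression_threshold` must be negative is inferred from the `-2.0` example rather than from SCHEMA.md.

```python
from dataclasses import dataclass
from typing import Any, Mapping

REQUIRED_FIELDS = ("primary_metric", "regression_threshold", "ramp_window_days")


@dataclass(frozen=True)
class RampConfig:
    primary_metric: str          # what to watch
    regression_threshold: float  # roll back at or below this delta
    ramp_window_days: int        # full-rollout cadence


def parse_ramp_config(frontmatter: Mapping[str, Any]) -> RampConfig:
    """Build a RampConfig from already-parsed PRD frontmatter."""
    missing = [name for name in REQUIRED_FIELDS if name not in frontmatter]
    if missing:
        raise ValueError(f"PRD frontmatter is missing: {', '.join(missing)}")
    config = RampConfig(
        primary_metric=str(frontmatter["primary_metric"]),
        regression_threshold=float(frontmatter["regression_threshold"]),
        ramp_window_days=int(frontmatter["ramp_window_days"]),
    )
    # Assumption: thresholds describe a drop, so they must be negative.
    if config.regression_threshold >= 0:
        raise ValueError("regression_threshold must be negative")
    if config.ramp_window_days < 1:
        raise ValueError("ramp_window_days must be at least 1")
    return config
```

Failing loudly on a missing field means a PRD without a declared metric never enters the ramp loop silently.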

What the agent reads

inputs:
  - flag-state from gate provider (per-feature, current ramp percentage)
  - metrics warehouse: 7-day window for each feature's primary_metric
  - PRD frontmatter: primary_metric, regression_threshold, ramp_window_days
  - cash budget: from finance signal (see business-signals page)

outputs (per feature, per day):
  - decision: ramp_up | hold | hold_with_warning | roll_back
  - reason: machine-readable + human paragraph
  - notification: posted to Telegram if decision != hold
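The input/output contract above could map onto a daily decision function along these lines. Several points are assumptions, not settled behavior: `Decision` and `decide` are hypothetical names, `hold_with_warning` is interpreted here as "the latest day regressed but the streak is not yet N days long", "beats baseline" is taken to mean a positive delta, and the cash-budget input is left out because this page does not say how it affects the decision.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Decision:
    decision: str     # ramp_up | hold | hold_with_warning | roll_back
    reason_code: str  # machine-readable reason
    reason_text: str  # short human-readable explanation
    notify: bool      # True when decision != "hold"


def decide(
    current_pct: int,
    daily_deltas: Sequence[float],  # metric minus baseline, oldest first
    regression_threshold: float,    # negative, e.g. -2.0
    consecutive_days: int,          # the "N" from the text
) -> Decision:
    if consecutive_days < 1:
        raise ValueError("consecutive_days must be at least 1")
    recent = list(daily_deltas[-consecutive_days:])
    full_window = len(recent) == consecutive_days

    if full_window and all(d <= regression_threshold for d in recent):
        decision, code = "roll_back", "regressed_n_days"
        text = f"Metric breached {regression_threshold} for {consecutive_days} days."
    elif recent and recent[-1] <= regression_threshold:
        decision, code = "hold_with_warning", "regressed_latest_day"
        text = "Latest day breached the threshold; ramp paused."
    elif full_window and all(d > 0 for d in recent) and current_pct < 100:
        decision, code = "ramp_up", "beat_baseline_n_days"
        text = f"Metric beat baseline for {consecutive_days} days."
    else:
        decision, code = "hold", "no_change"
        text = "No ramp or rollback condition met."
    return Decision(decision, code, text, notify=decision != "hold")
```

The rollback branch is checked first so that a sustained regression always wins over any other condition.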

Why this isn't a feature flag service

A feature flag service exposes a flag and lets a human ramp it. This page describes the agent that decides what to do with the service. The two are complementary, and we need both.

Without the agent, every "should we ramp?" decision falls back on a human; we've automated only the mechanics, not the work.

What "kill" doesn't mean

Kill flips the flag, not the code. The diff stays in main. The reasoning:

If a feature consistently fails and the team decides to remove it, that's a separate cleanup PR — the agent flags this scenario but doesn't auto-merge a code revert.

Failure modes

Cross-references