Direction Metrics for AI-Native Velocity
Outcomes lag by weeks. Direction moves with each iteration. The seven leading indicators that predict outcomes 4-8 weeks ahead, and the dual-cadence system.
The companion essay Outcome Accountability Is a Luxury Good makes the case. This chapter is the practice.
The two-layer measurement system
| Layer | Cadence | Indicators | Drives |
|---|---|---|---|
| 1 (Direction) | Daily / weekly | The seven leading indicators | Day-to-day decisions: what to ship, what to roll back, what to evaluate further |
| 2 (Outcomes) | Monthly / quarterly | NRR, CSAT (overall and by cohort), NPS, expansion revenue, churn rate | Strategic decisions: do we keep investing, are we pricing right, is the buyer changing |
Both layers are deliberate, and each is reviewed in its own meeting with its own audience.
The seven leading indicators
Each indicator needs a definition, a data source, a healthy band, the outcome it predicts, the lag, and its common pitfalls. In brief:
- Eval pass rate over the last 7 days. Predicts: customer CSAT 4 weeks out.
- Agent quality score from sampled outputs. Predicts: NPS on successor 6 weeks out.
- Iteration count and shipped changes. Predicts: feature adoption 8 weeks out.
- Design coherence. Predicts: customer trust 6 weeks out.
- Customer escalation rate. Predicts: churn 12 weeks out.
- Dispute rate on outcome billing. Predicts: NRR 8 weeks out.
- Latency at p95 and p99. Predicts: retention 6 weeks out.
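Each bullet above carries the same fields: a source, a healthy band, a predicted outcome, and a lag. A minimal sketch of that record in Python; the field names and the example band are illustrative assumptions, not values from this chapter:

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    """One leading indicator: source, healthy band, and what it predicts."""

    name: str
    source: str                        # where the data lives
    healthy_band: tuple[float, float]  # (low, high), inclusive
    predicts: str                      # the lagging outcome it forecasts
    lag_weeks: int                     # how far ahead it predicts

    def status(self, value: float) -> str:
        """Report whether today's reading sits inside the healthy band."""
        low, high = self.healthy_band
        return "healthy" if low <= value <= high else "out of band"


# Illustrative band -- set your own from historical data.
eval_pass = Indicator(
    name="eval_pass_rate_7d",
    source="eval harness",
    healthy_band=(0.92, 1.0),
    predicts="CSAT",
    lag_weeks=4,
)
print(eval_pass.status(0.95))
```

Keeping the band and the lag attached to the indicator means any dashboard or audit can read them instead of hard-coding thresholds.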
The Goodhart audit
A quarterly process. For each indicator, plot it against the outcome it predicts, shifted by the stated lag, and compute the correlation. Above 0.7: healthy. Between 0.5 and 0.7: weakening. Below 0.5: dead. Replace dead indicators.
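The audit is just a lagged correlation. A sketch with numpy, assuming weekly series for both the indicator and its outcome; the 0.7 and 0.5 thresholds come from the text above, everything else (function name, series shape) is an assumption:

```python
import numpy as np


def goodhart_audit(indicator: list, outcome: list, lag_weeks: int) -> str:
    """Correlate an indicator against its outcome shifted by the stated lag.

    Both arguments are weekly series of equal length; indicator[i] is
    supposed to predict outcome[i + lag_weeks].
    """
    x = np.asarray(indicator[:-lag_weeks], dtype=float)
    y = np.asarray(outcome[lag_weeks:], dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    if r > 0.7:
        return f"healthy (r={r:.2f})"
    if r >= 0.5:
        return f"weakening (r={r:.2f})"
    return f"dead (r={r:.2f}) -- replace this indicator"


# Synthetic check: an outcome that exactly tracks the indicator 4 weeks later.
print(goodhart_audit(list(range(20)), [0] * 4 + list(range(16)), 4))
```

Run it per indicator each quarter; anything in the "dead" bucket gets swapped out before it starts steering decisions.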
What this changes about how PMs are measured
Two evaluation layers. PMs are evaluated on the quality of their leading indicators (are they well-chosen, are they predicting outcomes, is the team's iteration cadence healthy), plus the strategic outcome layer on an annual or semi-annual cadence.
Outcome accountability moves from monthly to annual. Direction accountability becomes the day-to-day.
What to do this week
Build the indicator registry: a YAML file listing the seven indicators (or your team's chosen seven), where each indicator's data lives, its healthy band, and the outcome it predicts. The agent that operationalizes this is at /blog/agent-direction-dashboard.
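A starting point for that file, following the structure described above; the file name, sources, and bands are placeholders to replace with your own:

```yaml
# indicators.yaml -- illustrative values; set bands from your own history
indicators:
  - name: eval_pass_rate_7d
    source: eval-harness/daily-report
    healthy_band: [0.92, 1.00]
    predicts: CSAT
    lag_weeks: 4
  - name: agent_quality_score
    source: sampled-output-reviews
    healthy_band: [4.2, 5.0]
    predicts: NPS
    lag_weeks: 6
  - name: escalation_rate
    source: support-ticket-tags
    healthy_band: [0.00, 0.02]
    predicts: churn
    lag_weeks: 12
```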
Frequently asked
What are direction metrics?
Leading indicators measured on the cadence of the work itself. For agent products: eval pass rate, agent quality score, iteration count, design coherence, escalation rate, dispute rate, latency. They predict outcome metrics on a 4-8 week lag and drive day-to-day decisions in a way outcomes can't.
Why do you need direction metrics in addition to outcomes?
Because outcome cycles for AI features are 4-12 weeks (the data doesn't move faster than that). With agent products iterating 10-20 times per week, by the time an outcome can be attributed back you've shipped 40-80 more changes. Outcome accountability becomes a lagging measurement that can't drive day-to-day decisions. Direction metrics close the gap.
What are the seven leading indicators?
(1) Eval pass rate over the last 7 days. (2) Agent quality score from sampled outputs. (3) Iteration count and shipped changes. (4) Design coherence (do agent outputs match the brief). (5) Customer escalation rate. (6) Dispute rate on outcome billing. (7) Latency at p95 and p99.
How do you prevent gaming?
Treat leading indicators as derivatives of outcomes, not substitutes. Every quarter, audit whether each indicator is still correlating with the outcome it was chosen to predict. If correlation drops below 0.5 over 4 weeks, replace the indicator. Goodhart's law is real; the discipline is constant verification.
What is the two-layer measurement system?
Layer 1 (daily/weekly): the seven leading indicators. Drives day-to-day decisions. Reviewed in standup and weekly outcome cohort review. Layer 2 (monthly/quarterly): customer outcomes (NRR, CSAT, NPS, expansion). Drives strategic decisions. Reviewed in monthly business review and quarterly strategic review. Both deliberate. Most teams use only one.
Related reading
Deeper essays and other handbook chapters on the same thread.
Direction Dashboard Agent
Compiles seven leading indicators every morning, predicts outcomes 4-8 weeks ahead, runs a weekly Goodhart audit. Direction at AI-native velocity.
Outcome Accountability Is a Luxury Good. Measure Direction.
Outcome-driven roadmaps assume 6-12 month measurement cycles. Agents iterate ten times a week. The dual-cadence direction-metric system that closes the gap.
The Eval Is The Spec
Kill the PRD. Ship against a test set. The eval is the contract, the changelog, and the definition of done.
The Living Changelog
Your model vendor changed the model on Tuesday and didn't tell you. Run a daily replay against production or your customers will catch it before you do.
Continuous Discovery Doesn't Scale for AI-Native Products
Teresa Torres' continuous discovery is the right answer for human-centric SaaS and the wrong answer for agent products. The punctuated discovery alternative.
10 AI Agents I Built That Failed. The Honest Retrospective.
Ten AI agents that failed in production. Auto-approve expenses, sentiment classifiers, autonomous pricing, RAG over stale docs, and the lessons that stuck after each.