Direction Metrics for AI-Native Velocity

The companion essay Outcome Accountability Is a Luxury Good makes the case. This chapter is the practice.

The two-layer measurement system

Layer	Cadence	Indicators	Drives
1 (Direction)	Daily / weekly	The seven leading indicators	Day-to-day decisions: what to ship, what to roll back, what to evaluate further
2 (Outcomes)	Monthly / quarterly	NRR, CSAT, NPS, expansion revenue, churn rate, customer satisfaction by cohort	Strategic decisions: do we keep investing, are we pricing right, is the buyer changing

Both layers are deliberate, and each is reviewed in its own meeting with its own audience.

The seven leading indicators

Each indicator gets one section covering its definition, source, healthy band, what it predicts, the lag, and common pitfalls. The summary below lists the predicted outcome and the lag for each.

Eval pass rate over the last 7 days. Predicts: customer CSAT 4 weeks out.
Agent quality score from sampled outputs. Predicts: NPS on successor 6 weeks out.
Iteration count and shipped changes. Predicts: feature adoption 8 weeks out.
Design coherence. Predicts: customer trust 6 weeks out.
Customer escalation rate. Predicts: churn 12 weeks out.
Dispute rate on outcome billing. Predicts: NRR 8 weeks out.
Latency at p95 and p99. Predicts: retention 6 weeks out.
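Two of these indicators are plain arithmetic over raw events, so they are cheap to compute on the cadence of the work. A minimal sketch, assuming eval runs arrive as (date, passed) pairs and latencies as a list of milliseconds; the function names and both data shapes are hypothetical, not part of any existing tool:

```python
from datetime import date, timedelta
from statistics import quantiles


def eval_pass_rate(runs, today, window_days=7):
    """Share of eval runs that passed in the trailing window, today included.

    `runs` is a list of (run_date, passed) tuples (assumed shape).
    """
    start = today - timedelta(days=window_days - 1)
    recent = [passed for run_date, passed in runs if start <= run_date <= today]
    if not recent:
        return None  # no evals in the window: report a gap, not 0%
    return sum(recent) / len(recent)


def latency_percentiles(latencies_ms):
    """Return (p95, p99) from raw request latencies in milliseconds."""
    # quantiles(n=100) returns the 99 cut points p1..p99.
    cuts = quantiles(latencies_ms, n=100, method="inclusive")
    return cuts[94], cuts[98]


if __name__ == "__main__":
    runs = [
        (date(2024, 1, 10), True),
        (date(2024, 1, 9), False),
        (date(2024, 1, 4), True),
        (date(2024, 1, 3), True),  # outside the 7-day window
    ]
    print(eval_pass_rate(runs, today=date(2024, 1, 10)))
    print(latency_percentiles(list(range(1, 101))))
```

Returning None for an empty window is a design choice: a day with no eval runs should show up as missing data rather than as a failing score.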

The Goodhart audit

This is a quarterly process. For each indicator, plot it against the outcome it predicts, shifted by the indicator's lag, and compute the correlation. Above 0.7 is healthy. Between 0.5 and 0.7 is weakening. Below 0.5 is dead. Replace dead indicators.
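The audit can be sketched in a few lines. This version assumes both series are weekly, aligned on the same start week, and equal in length, and it uses a Pearson correlation, which is one reasonable reading of "correlation" here rather than a prescribed method. The function names are illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no signal
    return cov / sqrt(var_x * var_y)


def lagged_correlation(indicator, outcome, lag_weeks):
    """Correlate indicator[t] with outcome[t + lag_weeks]."""
    if len(indicator) != len(outcome):
        raise ValueError("series must cover the same weeks")
    if lag_weeks < 0:
        raise ValueError("lag_weeks must be non-negative")
    xs = indicator[: len(indicator) - lag_weeks]
    ys = outcome[lag_weeks:]
    if len(xs) < 3:
        raise ValueError("not enough overlapping weeks to audit")
    return pearson(xs, ys)


def audit_status(r):
    """Map a correlation to the bands used in the audit."""
    if r > 0.7:
        return "healthy"
    if r >= 0.5:
        return "weakening"
    return "dead"


if __name__ == "__main__":
    indicator = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
    # Toy outcome that tracks the indicator perfectly, four weeks later.
    outcome = [0, 0, 0, 0] + [2 * x for x in indicator[:8]]
    r = lagged_correlation(indicator, outcome, lag_weeks=4)
    print(round(r, 3), audit_status(r))
```

The sketch compares the raw coefficient to the bands. For an indicator expected to move opposite to its outcome, such as latency against retention, one option is to negate one series before the audit; the thresholds above do not say how to handle sign, so treat that as a team decision.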

What this changes about how PMs are measured

PMs are evaluated on two layers. The first is the quality of their leading indicators: are they well chosen, are they predicting outcomes, is the team's iteration cadence healthy. The second is the strategic outcome layer, on an annual or semi-annual cadence.

Outcome accountability moves from monthly to annual. Direction accountability becomes the day-to-day.

What to do this week

Build the indicator registry as a YAML file. List the seven indicators (or your team's chosen seven), where the data lives, the healthy band, and what each predicts. The agent that operationalizes this is at /blog/agent-direction-dashboard.
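One way to start the registry is to hold it as data and render the YAML from that, so the file and any dashboard code cannot drift apart. A minimal sketch: the field names and the tiny `to_yaml` helper are assumptions for illustration, and `source` and `healthy_band` are left as "TBD" because this chapter does not specify them.

```python
from dataclasses import dataclass, asdict


@dataclass
class Indicator:
    name: str
    predicts: str
    lag_weeks: int
    source: str = "TBD"        # where the data lives; fill in per team
    healthy_band: str = "TBD"  # not specified here; set your own band


# Names, predicted outcomes, and lags are copied from the indicator list.
REGISTRY = [
    Indicator("Eval pass rate over the last 7 days", "customer CSAT", 4),
    Indicator("Agent quality score from sampled outputs", "NPS on successor", 6),
    Indicator("Iteration count and shipped changes", "feature adoption", 8),
    Indicator("Design coherence", "customer trust", 6),
    Indicator("Customer escalation rate", "churn", 12),
    Indicator("Dispute rate on outcome billing", "NRR", 8),
    Indicator("Latency at p95 and p99", "retention", 6),
]


def to_yaml(indicators):
    """Render the registry as a flat YAML list (hand-rolled, stdlib only)."""
    lines = ["indicators:"]
    for ind in indicators:
        for i, (key, value) in enumerate(asdict(ind).items()):
            prefix = "  - " if i == 0 else "    "
            rendered = value if isinstance(value, int) else f'"{value}"'
            lines.append(f"{prefix}{key}: {rendered}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_yaml(REGISTRY))
```

Writing the output to a file such as `indicators.yaml` (a hypothetical name) gives the registry a reviewable home in version control.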

Frequently asked

What are direction metrics?

Leading indicators measured on the cadence of the work itself. For agent products: eval pass rate, agent quality score, iteration count, design coherence, escalation rate, dispute rate, latency. They predict outcome metrics on a 4-12 week lag and drive day-to-day decisions in a way outcomes can't.

Why do you need direction metrics in addition to outcomes?

Because outcome cycles for AI features are 4-12 weeks (the data doesn't move faster than that). With agent products iterating 10-20 times per week, by the time an outcome attributes back on even a 4-week cycle you've shipped 40-80 more changes, and on a 12-week cycle as many as 240. Outcome accountability becomes a lagging measurement that can't drive day-to-day decisions. Direction metrics close the gap.

What are the seven leading indicators?

(1) Eval pass rate over the last 7 days. (2) Agent quality score from sampled outputs. (3) Iteration count and shipped changes. (4) Design coherence (do agent outputs match the brief). (5) Customer escalation rate. (6) Dispute rate on outcome billing. (7) Latency at p95 and p99.

How do you prevent gaming?

Treat leading indicators as derivatives of outcomes, not substitutes. Every quarter, audit whether each indicator is still correlating with the outcome it was chosen to predict. If correlation drops below 0.5 over 4 weeks, replace the indicator. Goodhart's law is real; the discipline is constant verification.

What is the two-layer measurement system?

Layer 1 (daily/weekly): the seven leading indicators. Drives day-to-day decisions. Reviewed in standup and weekly outcome cohort review. Layer 2 (monthly/quarterly): customer outcomes (NRR, CSAT, NPS, expansion). Drives strategic decisions. Reviewed in monthly business review and quarterly strategic review. Both deliberate. Most teams use only one.

Related reading

Direction Dashboard Agent

Outcome Accountability Is a Luxury Good. Measure Direction.

The Eval Is The Spec

The Living Changelog

Continuous Discovery Doesn't Scale for AI-Native Products

10 AI Agents I Built That Failed. The Honest Retrospective.
