Falk Gottlob · 6 min read

Direction Dashboard Agent

Compiles seven leading indicators every morning, predicts outcomes 4-8 weeks ahead, runs a weekly Goodhart audit. Direction at AI-native velocity.


[FIG. 00 · Composite of Theses 01–03: per-seat ARR, outcome ARR, GM trough. Falkster · CPO Edition · Issue 01 · 2026]
Try it live
See this agent running in the sandbox

Stream a simulated run, inspect the notifications it would send on Slack and email, and see exactly where it sits in the 7-stage PM OS flow. No password required.

The short version

The Direction Dashboard agent pulls seven leading indicators every morning at 6 AM, predicts the outcome metrics they imply 4 to 8 weeks ahead, and posts a chart pack to a #direction Slack channel the whole team reads with morning coffee. The seven indicators are eval pass rate, agent quality score, iteration count, design coherence, customer escalation rate, billing dispute rate, and p95/p99 latency. Every Monday the agent runs a Goodhart audit, checking whether each indicator is still correlating with the outcome it's supposed to predict (above 0.7 healthy, 0.5 to 0.7 weakening, below 0.5 dead). The cultural shift is measuring direction at the cadence of the work, not the cadence of outcomes. Start by writing the direction-indicators.yaml registry.

The agent that turns the seven leading indicators into a dashboard the team actually reads

The companion essay Outcome Accountability Is a Luxury Good makes the case for measuring direction at AI-native velocity. This agent is the implementation.

Most product teams know they should track leading indicators. Few do, because compiling them requires pulling from five tools, normalizing the data, and producing a view the team will actually look at. By the time someone has built that view manually, the indicators are already a week stale.

The direction dashboard agent runs every morning. It pulls the seven leading indicators, computes the trends, predicts the outcome metrics 4-8 weeks ahead, and posts a single chart pack to a channel the whole team reads.

The seven leading indicators

Recap from the essay (so the agent's job is concrete):

  1. Eval pass rate over the last 7 days. Are the prompts shipped this week passing the evals written last month?
  2. Agent quality score from sampled outputs. 50 daily samples rated 1-5 by a senior engineer or product specialist.
  3. Iteration count and shipped changes. How many distinct improvements shipped this week.
  4. Design coherence. Do agent outputs match the brief? Sampled and rated.
  5. Customer escalation rate. Per outcome, what fraction got escalated to a human.
  6. Dispute rate on outcome billing. Per outcome, what fraction got disputed.
  7. Latency at p95 and p99. End-to-end response time on the agent surface.

Each predicts an outcome metric on a 4-8 week lag. Together they form the daily measurement layer.

What the agent does

Five jobs.

  1. Pull each indicator from its source system every morning at 6am.
  2. Compute the trends (7-day, 30-day, 90-day) and flag any indicator outside the noise band.
  3. Predict the outcome impact by running the indicator against a simple regression model that's been calibrated against the last 12 weeks of outcome data.
  4. Compose the daily chart pack showing each indicator's current state, trend, and predicted outcome impact.
  5. Post the chart pack to a Slack channel and the company dashboard, with named call-outs for the team that owns each indicator.
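
Sketched end to end, the daily run is a short script. The following is a minimal orchestration sketch, not the shipped agent: load_registry is concrete, while the other helper names (collect, compute_trends, predict_outcome, the storage and chart helpers) are assumptions that the component sketches below fill in, with illustrative signatures.

# daily_run.py · one 6am pass tying the five jobs together (sketch).
# Helper names below are assumptions; see the component sketches.

import yaml

def load_registry(path: str = "direction-indicators.yaml") -> list[dict]:
    # Component 1: the indicator registry.
    with open(path) as f:
        return yaml.safe_load(f)

def daily_run() -> None:
    for ind in load_registry():
        today = collect(ind)                                     # job 1: pull from source
        history = append_and_load(ind["id"], today)              # storage layer, elided
        trend = compute_trends(history, ind["drift_threshold"])  # job 2: trends + drift
        forecast = predict_outcome(ind)                          # job 3: calibrated regression
        add_chart(ind, history, trend, forecast)                 # job 4: chart pack
    post_chart_pack()                                            # job 5: Slack + dashboard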

The seven components

1. The indicator registry. A YAML file (direction-indicators.yaml) listing each indicator: name, source query, healthy band, drift threshold, the outcome it predicts, the lag in weeks.

- id: eval_pass_rate_7d
  name: "Eval pass rate (7-day)"
  source: "evals_table"
  query: "SELECT AVG(pass) FROM evals WHERE run_date > now() - interval '7 days'"
  healthy_band: [0.85, 1.0]
  drift_threshold: 0.05
  predicts: "customer_csat_4w"  # CSAT 4 weeks out
  lag_weeks: 4
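
Before building anything against the registry, a quick sanity check pays for itself. A minimal sketch, assuming only the schema shown above:

# validate_registry.py · assert every entry carries the full schema (sketch).

import yaml

REQUIRED = {"id", "name", "source", "query", "healthy_band",
            "drift_threshold", "predicts", "lag_weeks"}

def validate_registry(path: str = "direction-indicators.yaml") -> None:
    with open(path) as f:
        entries = yaml.safe_load(f)
    for entry in entries:
        missing = REQUIRED - entry.keys()
        assert not missing, f"{entry.get('id', '?')}: missing {missing}"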

2. The collector. Pulls each indicator from its source. 40 lines of Python plus warehouse credentials.
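
A sketch of that collector, assuming a SQL warehouse reachable through SQLAlchemy and a DSN in a WAREHOUSE_URL environment variable (both assumptions, not part of the blueprint):

# collect() · run an indicator's registry query, return one number (sketch).

import os
from sqlalchemy import create_engine, text

engine = create_engine(os.environ["WAREHOUSE_URL"])

def collect(indicator: dict) -> float:
    # Each registry query is written to return a single scalar.
    with engine.connect() as conn:
        value = conn.execute(text(indicator["query"])).scalar()
    return float(value)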

3. The trend computer. For each indicator, compute 7-day, 30-day, 90-day moving averages. Flag drift when current 7-day deviates from 30-day average by more than the threshold.
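
In pandas that's a few lines. A sketch, assuming history arrives as a daily pandas Series of readings:

# compute_trends() · moving averages plus the drift flag (sketch).

import pandas as pd

def compute_trends(history: pd.Series, drift_threshold: float) -> dict:
    ma7 = history.rolling(7).mean().iloc[-1]
    ma30 = history.rolling(30).mean().iloc[-1]
    ma90 = history.rolling(90).mean().iloc[-1]
    return {
        "ma7": ma7, "ma30": ma30, "ma90": ma90,
        # Drift: current 7-day average outside the noise band around the 30-day.
        "drifting": abs(ma7 - ma30) > drift_threshold,
    }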

4. The outcome predictor. A small calibration model (linear regression or similar) trained on the last 12 weeks of leading indicator + outcome data. For each leading indicator, predict its corresponding outcome 4-8 weeks ahead. Update the calibration weekly.
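
The model can be as small as one scikit-learn regression per indicator. A sketch, assuming weekly-aligned lists of indicator and outcome values:

# predict_outcome() · fit on (indicator at week t, outcome at week t+lag),
# then forecast the outcome lag_weeks out from this week's reading (sketch).

import numpy as np
from sklearn.linear_model import LinearRegression

def predict_outcome(indicator_weekly: list, outcome_weekly: list,
                    lag_weeks: int) -> float:
    X = np.array(indicator_weekly[:-lag_weeks]).reshape(-1, 1)
    y = np.array(outcome_weekly[lag_weeks:])
    model = LinearRegression().fit(X, y)   # recalibrated weekly
    return float(model.predict([[indicator_weekly[-1]]])[0])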

5. The chart pack composer. Generates one chart per indicator showing 90 days of history, current trend, and predicted outcome line. Plus a summary panel with the headline number for each: green (on plan), yellow (drifting), red (out of band).
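
One matplotlib panel per indicator is enough to start. A sketch of a single chart; the green/yellow/red status rendering is left to the summary panel:

# indicator_chart() · 90 days of history plus the predicted level (sketch).

import matplotlib.pyplot as plt
import pandas as pd

def indicator_chart(name: str, history: pd.Series, predicted: float,
                    out_path: str) -> None:
    fig, ax = plt.subplots(figsize=(6, 2.5))
    history.tail(90).plot(ax=ax, label=name)
    ax.axhline(predicted, linestyle="--", label="predicted outcome level")
    ax.legend(loc="upper left", fontsize=8)
    fig.savefig(out_path, bbox_inches="tight")
    plt.close(fig)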

6. The Slack post. Daily post at 6am to a #direction channel: chart pack image, summary status, and a one-line "what changed" comment. Pinned post. The team reads it with morning coffee.
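
A sketch of the post itself, assuming a Slack incoming webhook for the channel (SLACK_WEBHOOK_URL is an assumption) and cron for the 6am schedule; webhooks can't pin messages, so pin the first post by hand:

# post_chart_pack() · daily Slack post via an incoming webhook (sketch).
# Schedule with cron: 0 6 * * *  python daily_run.py

import os
import requests

def post_chart_pack(summary: str, chart_pack_url: str) -> None:
    payload = {
        "text": summary,  # one-line "what changed" + status summary
        "blocks": [
            {"type": "section", "text": {"type": "mrkdwn", "text": summary}},
            {"type": "image", "image_url": chart_pack_url,
             "alt_text": "direction chart pack"},
        ],
    }
    requests.post(os.environ["SLACK_WEBHOOK_URL"], json=payload, timeout=10)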

7. The weekly Goodhart check. Every Monday, the agent audits whether each leading indicator is still predicting its outcome. If correlation has dropped below 0.5 over the last 4 weeks, flag the indicator for replacement. Goodhart's law is real; the agent watches for it.

The Goodhart audit prompt

The component that prevents leading indicator gaming.

You are auditing whether the leading indicator '${indicator_name}' is still predicting the outcome '${predicted_outcome}'.

Last 12 weeks of data:
${indicator_weekly}: ${indicator_values}
${outcome_weekly}: ${outcome_values}

Compute the correlation between the indicator (lagged ${lag_weeks} weeks) and the outcome.

If correlation > 0.7: indicator is healthy.
If correlation 0.5-0.7: indicator is weakening, watch.
If correlation < 0.5: indicator is no longer predicting the outcome. Likely Goodharted or no longer relevant.

For weakening or dead indicators, suggest a replacement based on what's correlating better with the outcome over the last 12 weeks.

Return JSON.
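
The correlation itself doesn't need a model call. A numpy sketch of the deterministic core the prompt wraps:

# lagged_correlation() · the number the Goodhart audit hinges on (sketch).

import numpy as np

def lagged_correlation(indicator_weekly: list, outcome_weekly: list,
                       lag_weeks: int) -> float:
    x = np.array(indicator_weekly[:-lag_weeks])
    y = np.array(outcome_weekly[lag_weeks:])
    return float(np.corrcoef(x, y)[0, 1])

def audit_status(r: float) -> str:
    if r > 0.7:
        return "healthy"
    if r >= 0.5:
        return "weakening"
    return "dead"  # likely Goodharted or no longer relevant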

When the agent flags a dead indicator, the team has a real conversation about replacing it. This is the discipline that keeps direction measurement honest over time.

What this changes about measurement reviews

Without the agent, direction metrics are nominally tracked but rarely reviewed because compilation is too slow. The team falls back to outcomes, which lag by weeks. Decisions slow down to wait for outcome data.

With the agent, direction metrics are a daily reality. The team's morning conversation includes them. Decisions about what to ship next, what to roll back, and what to evaluate further get made in the morning standup, straight off the dashboard.

The cultural shift is that the team measures the work at the cadence of the work. Outcomes still matter; they live on the slower cadence in monthly and quarterly reviews. The two layers coexist deliberately.

What to try this week

Build the indicator registry. Just the YAML file. List the seven indicators (or your team's chosen seven), where the data lives, the healthy band, what each predicts.

Most teams discover that two or three of the indicators they "track" don't actually have a query that produces them. The data is conceptually there, scattered across logs or product analytics, but never compiled into one number. Fix that gap before building the agent.

Once the registry is honest, the agent's collector and chart pack generator are a weekend of engineering work.


The full agent blueprint, including the YAML schemas, the calibration model, and the chart pack templates, is at /artifacts/agent-direction-dashboard. The companion essay on direction measurement is at /blog/outcome-accountability-is-a-luxury-good. The handbook chapter on direction metrics is at /handbook/direction-metrics.


Frequently asked

What does the direction dashboard agent do?

Pulls the seven leading indicators from their source systems every morning at 6am, computes 7/30/90-day trends, runs each indicator against a calibrated regression model to predict its corresponding outcome 4-8 weeks ahead, and posts a daily chart pack to the team's #direction Slack channel.

What are the seven leading indicators?

Eval pass rate (7-day), agent quality score from sampled outputs, iteration count and shipped changes, design coherence, customer escalation rate, dispute rate on outcome billing, latency at p95 and p99. Each predicts an outcome metric on a 4-8 week lag.

What is the Goodhart audit?

A weekly check that asks whether each leading indicator is still correlating with the outcome it was chosen to predict. Correlation > 0.7 healthy, 0.5-0.7 weakening, < 0.5 dead and replaceable. Goodhart's law is real; the agent watches for the team gaming the metric.

Why daily posting and not weekly?

Direction metrics move with each iteration. With agent products shipping 10-20 times per week, weekly review is too slow to drive day-to-day decisions. The daily dashboard is the operational layer; weekly outcome cohort review is the strategic layer. Both run deliberately.

About the author

Falk Gottlob

Product Executive · Founder, Falkster.AI

Thirty years shipping product at Microsoft Research, Adobe, Salesforce (Marketing Cloud / Quip / Slack), and several startups including one $6.5B exit and one acquired by Microsoft. Now CPO at Smartcat and founder of Falkster.AI, writing this notebook from the boardroom, not the keyboard.

