Falk Gottlob · 6 min read

Direction Dashboard Agent

Compiles seven leading indicators every morning, predicts outcomes 4-8 weeks ahead, runs a weekly Goodhart audit. Direction at AI-native velocity.


[FIG. 00 · Composite of Theses 01–03: per-seat ARR, outcome ARR, GM trough. Falkster · CPO Edition · Issue 01 · 2026]
Try it live
See this agent running in the sandbox

Stream a simulated run, inspect the notifications it would send on Slack and email, and see exactly where it sits in the 7-stage PM OS flow. No password required.

The short version

The Direction Dashboard agent pulls seven leading indicators every morning at 6 AM, predicts the outcome metrics they imply 4 to 8 weeks ahead, and posts a chart pack to a #direction Slack channel the whole team reads with morning coffee. The seven indicators are eval pass rate, agent quality score, iteration count, design coherence, customer escalation rate, billing dispute rate, and p95/p99 latency. Every Monday the agent runs a Goodhart audit, checking whether each indicator is still correlating with the outcome it's supposed to predict (above 0.7 healthy, 0.5 to 0.7 weakening, below 0.5 dead). The cultural shift is measuring direction at the cadence of the work, not the cadence of outcomes. Start by writing the direction-indicators.yaml registry.

The agent that turns the seven leading indicators into a dashboard the team actually reads

The companion essay Outcome Accountability Is a Luxury Good makes the case for measuring direction at AI-native velocity. This agent is the implementation.

Most product teams know they should track leading indicators. Few do, because compiling them requires pulling from five tools, normalizing the data, and producing a view the team will actually look at. By the time someone has built that view manually, the indicators are already a week stale.

The direction dashboard agent runs every morning. It pulls the seven leading indicators, computes the trends, predicts the outcome metrics 4-8 weeks ahead, and posts a single chart pack to a channel the whole team reads.

The seven leading indicators

Recap from the essay (so the agent's job is concrete):

  1. Eval pass rate over the last 7 days. Are the prompts shipped this week passing the evals written last month?
  2. Agent quality score from sampled outputs. 50 daily samples rated 1-5 by a senior engineer or product specialist.
  3. Iteration count and shipped changes. How many distinct improvements shipped this week.
  4. Design coherence. Do agent outputs match the brief? Sampled and rated.
  5. Customer escalation rate. Per outcome, what fraction got escalated to a human.
  6. Dispute rate on outcome billing. Per outcome, what fraction got disputed.
  7. Latency at p95 and p99. End-to-end response time on the agent surface.

Each predicts an outcome metric on a 4-8 week lag. Together they form the daily measurement layer.

What the agent does

Five jobs.

  1. Pull each indicator from its source system every morning at 6am.
  2. Compute the trends (7-day, 30-day, 90-day) and flag any indicator outside the noise band.
  3. Predict the outcome impact by running the indicator against a simple regression model that's been calibrated against the last 12 weeks of outcome data.
  4. Compose the daily chart pack showing each indicator's current state, trend, and predicted outcome impact.
  5. Post the chart pack to a Slack channel and the company dashboard, with named call-outs for the team that owns each indicator.
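
Sketched end to end, the daily run is a short script. The following is a minimal orchestration sketch, not the shipped agent: load_registry is concrete, while the other helper names (collect, compute_trends, predict_outcome, the storage and chart helpers) are assumptions that the component sketches below fill in, with illustrative signatures.

# daily_run.py · one 6am pass tying the five jobs together (sketch).
# Helper names below are assumptions; see the component sketches.

import yaml

def load_registry(path: str = "direction-indicators.yaml") -> list[dict]:
    # Component 1: the indicator registry.
    with open(path) as f:
        return yaml.safe_load(f)

def daily_run() -> None:
    for ind in load_registry():
        today = collect(ind)                                     # job 1: pull from source
        history = append_and_load(ind["id"], today)              # storage layer, elided
        trend = compute_trends(history, ind["drift_threshold"])  # job 2: trends + drift
        forecast = predict_outcome(ind)                          # job 3: calibrated regression
        add_chart(ind, history, trend, forecast)                 # job 4: chart pack
    post_chart_pack()                                            # job 5: Slack + dashboard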

The seven components

1. The indicator registry. A YAML file (direction-indicators.yaml) listing each indicator: name, source query, healthy band, drift threshold, the outcome it predicts, the lag in weeks.

- id: eval_pass_rate_7d
  name: "Eval pass rate (7-day)"
  source: "evals_table"
  query: "SELECT AVG(pass) FROM evals WHERE run_date > now() - interval '7 days'"
  healthy_band: [0.85, 1.0]
  drift_threshold: 0.05
  predicts: "customer_csat_4w"  # CSAT 4 weeks out
  lag_weeks: 4
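
Before building anything against the registry, a quick sanity check pays for itself. A minimal sketch, assuming only the schema shown above:

# validate_registry.py · assert every entry carries the full schema (sketch).

import yaml

REQUIRED = {"id", "name", "source", "query", "healthy_band",
            "drift_threshold", "predicts", "lag_weeks"}

def validate_registry(path: str = "direction-indicators.yaml") -> None:
    with open(path) as f:
        entries = yaml.safe_load(f)
    for entry in entries:
        missing = REQUIRED - entry.keys()
        assert not missing, f"{entry.get('id', '?')}: missing {missing}"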

2. The collector. Pulls each indicator from its source. 40 lines of Python plus warehouse credentials.
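
A sketch of that collector, assuming a SQL warehouse reachable through SQLAlchemy and a DSN in a WAREHOUSE_URL environment variable (both assumptions, not part of the blueprint):

# collect() · run an indicator's registry query, return one number (sketch).

import os
from sqlalchemy import create_engine, text

engine = create_engine(os.environ["WAREHOUSE_URL"])

def collect(indicator: dict) -> float:
    # Each registry query is written to return a single scalar.
    with engine.connect() as conn:
        value = conn.execute(text(indicator["query"])).scalar()
    return float(value)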

3. The trend computer. For each indicator, compute 7-day, 30-day, 90-day moving averages. Flag drift when current 7-day deviates from 30-day average by more than the threshold.
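
In pandas that's a few lines. A sketch, assuming history arrives as a daily pandas Series of readings:

# compute_trends() · moving averages plus the drift flag (sketch).

import pandas as pd

def compute_trends(history: pd.Series, drift_threshold: float) -> dict:
    ma7 = history.rolling(7).mean().iloc[-1]
    ma30 = history.rolling(30).mean().iloc[-1]
    ma90 = history.rolling(90).mean().iloc[-1]
    return {
        "ma7": ma7, "ma30": ma30, "ma90": ma90,
        # Drift: current 7-day average outside the noise band around the 30-day.
        "drifting": abs(ma7 - ma30) > drift_threshold,
    }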

4. The outcome predictor. A small calibration model (linear regression or similar) trained on the last 12 weeks of leading indicator + outcome data. For each leading indicator, predict its corresponding outcome 4-8 weeks ahead. Update the calibration weekly.
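
The model can be as small as one scikit-learn regression per indicator. A sketch, assuming weekly-aligned lists of indicator and outcome values:

# predict_outcome() · fit on (indicator at week t, outcome at week t+lag),
# then forecast the outcome lag_weeks out from this week's reading (sketch).

import numpy as np
from sklearn.linear_model import LinearRegression

def predict_outcome(indicator_weekly: list, outcome_weekly: list,
                    lag_weeks: int) -> float:
    X = np.array(indicator_weekly[:-lag_weeks]).reshape(-1, 1)
    y = np.array(outcome_weekly[lag_weeks:])
    model = LinearRegression().fit(X, y)   # recalibrated weekly
    return float(model.predict([[indicator_weekly[-1]]])[0])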

5. The chart pack composer. Generates one chart per indicator showing 90 days of history, current trend, and predicted outcome line. Plus a summary panel with the headline number for each: green (on plan), yellow (drifting), red (out of band).
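
One matplotlib panel per indicator is enough to start. A sketch of a single chart; the green/yellow/red status rendering is left to the summary panel:

# indicator_chart() · 90 days of history plus the predicted level (sketch).

import matplotlib.pyplot as plt
import pandas as pd

def indicator_chart(name: str, history: pd.Series, predicted: float,
                    out_path: str) -> None:
    fig, ax = plt.subplots(figsize=(6, 2.5))
    history.tail(90).plot(ax=ax, label=name)
    ax.axhline(predicted, linestyle="--", label="predicted outcome level")
    ax.legend(loc="upper left", fontsize=8)
    fig.savefig(out_path, bbox_inches="tight")
    plt.close(fig)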

6. The Slack post. Daily post at 6am to a #direction channel: chart pack image, summary status, and a one-line "what changed" comment. Pinned post. The team reads it with morning coffee.
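
A sketch of the post itself, assuming a Slack incoming webhook for the channel (SLACK_WEBHOOK_URL is an assumption) and cron for the 6am schedule; webhooks can't pin messages, so pin the first post by hand:

# post_chart_pack() · daily Slack post via an incoming webhook (sketch).
# Schedule with cron: 0 6 * * *  python daily_run.py

import os
import requests

def post_chart_pack(summary: str, chart_pack_url: str) -> None:
    payload = {
        "text": summary,  # one-line "what changed" + status summary
        "blocks": [
            {"type": "section", "text": {"type": "mrkdwn", "text": summary}},
            {"type": "image", "image_url": chart_pack_url,
             "alt_text": "direction chart pack"},
        ],
    }
    requests.post(os.environ["SLACK_WEBHOOK_URL"], json=payload, timeout=10)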

7. The weekly Goodhart check. Every Monday, the agent audits whether each leading indicator is still predicting its outcome. If correlation has dropped below 0.5 over the last 4 weeks, flag the indicator for replacement. Goodhart's law is real; the agent watches for it.

The Goodhart audit prompt

The component that prevents leading indicator gaming.

You are auditing whether the leading indicator '${indicator_name}' is still predicting the outcome '${predicted_outcome}'.

Last 12 weeks of data:
${indicator_weekly}: ${indicator_values}
${outcome_weekly}: ${outcome_values}

Compute the correlation between the indicator (lagged ${lag_weeks} weeks) and the outcome.

If correlation > 0.7: indicator is healthy.
If correlation 0.5-0.7: indicator is weakening, watch.
If correlation < 0.5: indicator is no longer predicting the outcome. Likely Goodharted or no longer relevant.

For weakening or dead indicators, suggest a replacement based on what's correlating better with the outcome over the last 12 weeks.

Return JSON.
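
The correlation itself doesn't need a model call. A numpy sketch of the deterministic core the prompt wraps:

# lagged_correlation() · the number the Goodhart audit hinges on (sketch).

import numpy as np

def lagged_correlation(indicator_weekly: list, outcome_weekly: list,
                       lag_weeks: int) -> float:
    x = np.array(indicator_weekly[:-lag_weeks])
    y = np.array(outcome_weekly[lag_weeks:])
    return float(np.corrcoef(x, y)[0, 1])

def audit_status(r: float) -> str:
    if r > 0.7:
        return "healthy"
    if r >= 0.5:
        return "weakening"
    return "dead"  # likely Goodharted or no longer relevant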

When the agent flags a dead indicator, the team has a real conversation about replacing it. This is the discipline that keeps direction measurement honest over time.

What this changes about measurement reviews

Without the agent, direction metrics are nominally tracked but rarely reviewed because compilation is too slow. The team falls back to outcomes, which lag by weeks. Decisions slow down to wait for outcome data.

With the agent, direction metrics are a daily reality. The team's morning conversation includes them. Decisions about what to ship next, what to roll back, and what to evaluate further get made in the morning standup, straight off the dashboard.

The cultural shift is that the team measures the work at the cadence of the work. Outcomes still matter; they live on the slower cadence in monthly and quarterly reviews. The two layers coexist deliberately.

What to try this week

Build the indicator registry. Just the YAML file. List the seven indicators (or your team's chosen seven), where the data lives, the healthy band, what each predicts.

Most teams discover that two or three of the indicators they "track" don't actually have a query that produces them. The data is conceptually there, scattered across logs or product analytics, but never compiled into one number. Fix that gap before building the agent.

Once the registry is honest, the agent's collector and chart pack generator are a weekend of engineering work.


The full agent blueprint, including the YAML schemas, the calibration model, and the chart pack templates, is at /artifacts/agent-direction-dashboard. The companion essay on direction measurement is at /blog/outcome-accountability-is-a-luxury-good. The handbook chapter on direction metrics is at /handbook/direction-metrics.


Frequently asked

What does the direction dashboard agent do?

Pulls the seven leading indicators from their source systems every morning at 6am, computes 7/30/90-day trends, runs each indicator against a calibrated regression model to predict its corresponding outcome 4-8 weeks ahead, and posts a daily chart pack to the team's #direction Slack channel.

What are the seven leading indicators?

Eval pass rate (7-day), agent quality score from sampled outputs, iteration count and shipped changes, design coherence, customer escalation rate, dispute rate on outcome billing, latency at p95 and p99. Each predicts an outcome metric on a 4-8 week lag.

What is the Goodhart audit?

A weekly check that asks whether each leading indicator is still correlating with the outcome it was chosen to predict. Correlation > 0.7 healthy, 0.5-0.7 weakening, < 0.5 dead and replaceable. Goodhart's law is real; the agent watches for the team gaming the metric.

Why daily posting and not weekly?

Direction metrics move with each iteration. With agent products shipping 10-20 times per week, weekly review is too slow to drive day-to-day decisions. The daily dashboard is the operational layer; weekly outcome cohort review is the strategic layer. Both run deliberately.

About the author

Falk Gottlob

Product Executive · Founder, Falkster.AI

Thirty years shipping product at Microsoft Research, Adobe, Salesforce (Marketing Cloud / Quip / Slack), and several startups including one $6.5B exit and one acquired by Microsoft. Now CPO at Smartcat and founder of Falkster.AI, writing this notebook from the boardroom, not the keyboard.

