The short version
The Margin Watch agent computes gross margin per outcome daily for every SKU and customer cohort, classifies what's compressing margin (token cost, prompt regression, customer mix shift, infrastructure, escalation), and forecasts Jevons cliff risk under three pricing scenarios over the next 6 to 12 months. The CPO digest lands Monday with the three biggest movers and one specific decision to make this week. The cultural shift is that pricing reviews stop being post-mortems run by FP&A and become CPO-led decisions with FP&A in support. To start, compute gross margin per outcome by hand for one SKU, using last week's data. If you can't, you have a measurement gap to fix before the agent can run.
The agent that turns gross margin from a quarterly surprise into a daily signal
Most CPOs see gross margin as a financial output reported quarterly by FP&A. By the time the number lands in the board deck, the conditions that produced it (token cost shifts, prompt regressions, customer mix changes, infrastructure overhead) are months old. You're explaining the past instead of steering the future.
The Margin Watch agent inverts this. Gross margin per outcome is computed daily. Trends are surfaced weekly. Cliff risks (where margin is about to compress quickly because of a foreseeable change) are flagged before they hit the financials.
For Service-as-Software companies, this is the difference between a company that runs the trough deliberately and one that gets surprised by it.
What the watch does
Five jobs.
- Compute gross margin per outcome every day, for every major SKU and customer cohort.
- Track inference cost trends at the prompt level, not just the aggregate.
- Detect margin compression when gross margin per outcome drops outside the noise band, and classify the cause.
- Forecast Jevons cliff risk by modeling what happens to margin under expected inference cost drops over the next 6-12 months.
- Hand the CPO a weekly digest with the three most important margin trends and the recommended actions.
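One way to read "outside the noise band" in the third job is a drop of more than a couple of standard deviations below the trailing mean. The sketch below assumes that definition. The function name, the 8-point history, and the multiplier `k` are illustrative choices, not part of the blueprint.

```python
from statistics import mean, stdev


def outside_noise_band(history: list[float], current: float, k: float = 2.0) -> bool:
    """Return True when `current` is more than k standard deviations below the mean of `history`.

    Assumption: the noise band is mean +/- k * sample standard deviation.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical margins")
    return current < mean(history) - k * stdev(history)


# Illustrative margins (as fractions) for the trailing 8 weeks; not real data.
trailing = [0.66, 0.64, 0.65, 0.67, 0.65, 0.64, 0.66, 0.65]

print(outside_noise_band(trailing, 0.60))  # a drop well below the band
print(outside_noise_band(trailing, 0.64))  # ordinary week-to-week wobble
```

A fixed percentage threshold is simpler to explain. A band scaled to the SKU's own variability is less likely to fire on products that are naturally noisy.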
The seven components
1. The cost ledger. A daily table tracking, per outcome: token input cost, token output cost, infrastructure cost (compute, storage, networking), escalation cost (when a human had to step in), QA cost (eval runs against the outcome). Sourced from your model provider invoices, infrastructure billing, CS tooling, and eval logs.
2. The revenue ledger. Daily revenue per outcome, sourced from the billing system. Filtered by SKU and customer cohort.
3. Gross margin per outcome calculator. (Revenue - cost) / revenue, computed daily. Stored with timestamps for trend analysis.
4. The inference cost decomposer. For each major prompt, track input tokens, output tokens, and the model used. When margin compresses, the decomposer surfaces whether the cause is more input tokens (longer customer contexts), more output tokens (more verbose responses), or a model change (routing to a more expensive model for some reason).
5. Margin compression detector. Compares this week's gross margin per outcome to the trailing 8-week trend. If margin drops more than a configurable threshold week-over-week (default: 3%), it classifies the cause: token cost (model providers raised prices), prompt change (prompts routed to a more expensive model or generated longer outputs), customer mix (more high-cost customers), infrastructure (compute or storage spike), or escalation (agent quality dropped, so more cases went to humans).
6. Jevons forecaster. A model that takes current margin, current inference cost, and the industry's published cost-decline trajectory (typically 50% per year). Forecasts gross margin per outcome 6 and 12 months out under three scenarios: hold prices (capture margin), pass savings (capture share), partial pass (split the difference). Updates monthly.
7. CPO weekly digest. Every Monday morning, a summary email or Slack post: gross margin per outcome trend, the three biggest movers (customers, prompts, models), the cliff risks for the next quarter, and the one decision the CPO should make this week.
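The inference cost decomposer (component 4) is the least obvious of the seven, so here is a minimal sketch of the arithmetic it could use. The `PromptUsage` schema, the `decompose` function, and the per-token prices are all hypothetical. The sketch values token-count changes at the old prices and the price change at the new token counts, so the three parts add up to the total change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptUsage:
    """Average usage per outcome for one prompt (hypothetical schema)."""
    input_tokens: float
    output_tokens: float
    input_price: float   # dollars per input token on the routed model
    output_price: float  # dollars per output token on the routed model

    @property
    def cost(self) -> float:
        return self.input_tokens * self.input_price + self.output_tokens * self.output_price


def decompose(before: PromptUsage, after: PromptUsage) -> dict[str, float]:
    """Split the change in inference cost per outcome into three drivers."""
    return {
        # More or fewer input tokens, valued at the old input price.
        "input_tokens": (after.input_tokens - before.input_tokens) * before.input_price,
        # More or fewer output tokens, valued at the old output price.
        "output_tokens": (after.output_tokens - before.output_tokens) * before.output_price,
        # Price change (for example, routing to another model), valued at the new token counts.
        "model_change": (
            after.input_tokens * (after.input_price - before.input_price)
            + after.output_tokens * (after.output_price - before.output_price)
        ),
    }


# Illustrative numbers only: contexts got longer, the model stayed the same.
before = PromptUsage(input_tokens=2000, output_tokens=400, input_price=3e-6, output_price=15e-6)
after = PromptUsage(input_tokens=2600, output_tokens=400, input_price=3e-6, output_price=15e-6)

parts = decompose(before, after)
driver = max(parts, key=parts.get)
print(driver, parts)
```

The largest part names the driver to investigate first. When two parts are close in size, that is a hint the week belongs in the multi-factor bucket.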
The compression classification prompt
You are diagnosing why gross margin per outcome dropped this week.
This week's margin per outcome: ${current_margin}.
Last 8 weeks' average margin per outcome: ${trend_avg}.
Drop: ${drop_pct}%.
The components:
- Token input cost change: ${input_token_cost_delta}
- Token output cost change: ${output_token_cost_delta}
- Model mix change: ${model_mix_delta}
- Customer mix change: ${customer_mix_delta}
- Infrastructure cost change: ${infra_cost_delta}
- Escalation rate change: ${escalation_delta}
Classify the cause as one of:
- model_provider_pricing: providers raised prices
- prompt_regression: prompts got longer or routed to more expensive models
- customer_mix_shift: cohort of higher-cost customers grew
- infrastructure: compute or storage spike
- escalation_spike: agent quality dropped, more human escalations
- multi_factor: two or more concurrent
Confidence 0-100. One-paragraph diagnostic. One specific recommended action.
Return JSON.
The classification is what lets the CPO act. For example: "Margin dropped 4% this week, classified as prompt regression with 88% confidence; the new prompt for ticket triage routes to a more expensive model. Recommended action: review the routing logic with the agent team this week."
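The `${name}` placeholders in the prompt happen to match Python's `string.Template` syntax, so filling and checking it could look like the sketch below. The JSON key names (`cause`, `confidence`, `diagnostic`, `recommended_action`) are assumptions, since the prompt only says "Return JSON". The LLM call itself is left out and replaced with a hand-written reply.

```python
import json
from string import Template

# The prompt from this section, with string.Template's ${name} placeholders.
PROMPT = Template("""You are diagnosing why gross margin per outcome dropped this week.
This week's margin per outcome: ${current_margin}.
Last 8 weeks' average margin per outcome: ${trend_avg}.
Drop: ${drop_pct}%.
The components:
- Token input cost change: ${input_token_cost_delta}
- Token output cost change: ${output_token_cost_delta}
- Model mix change: ${model_mix_delta}
- Customer mix change: ${customer_mix_delta}
- Infrastructure cost change: ${infra_cost_delta}
- Escalation rate change: ${escalation_delta}
Classify the cause as one of:
- model_provider_pricing: providers raised prices
- prompt_regression: prompts got longer or routed to more expensive models
- customer_mix_shift: cohort of higher-cost customers grew
- infrastructure: compute or storage spike
- escalation_spike: agent quality dropped, more human escalations
- multi_factor: two or more concurrent
Confidence 0-100. One-paragraph diagnostic. One specific recommended action.
Return JSON.""")

CAUSES = {
    "model_provider_pricing", "prompt_regression", "customer_mix_shift",
    "infrastructure", "escalation_spike", "multi_factor",
}


def parse_classification(raw: str) -> dict:
    """Validate the model's JSON reply. The key names are assumed, not specified by the prompt."""
    data = json.loads(raw)
    if not isinstance(data, dict) or data.get("cause") not in CAUSES:
        raise ValueError("reply must be a JSON object with a known cause")
    confidence = data.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0 <= confidence <= 100:
        raise ValueError("confidence must be a number between 0 and 100")
    return data


# substitute() raises KeyError if any placeholder is left unfilled.
rendered = PROMPT.substitute(
    current_margin="61%", trend_avg="65%", drop_pct="4",
    input_token_cost_delta="+2%", output_token_cost_delta="+18%",
    model_mix_delta="+12% of calls on the larger model",
    customer_mix_delta="flat", infra_cost_delta="flat", escalation_delta="flat",
)

# A hand-written reply standing in for the LLM response.
reply = '{"cause": "prompt_regression", "confidence": 88, "diagnostic": "...", "recommended_action": "..."}'
result = parse_classification(reply)
print(result["cause"], result["confidence"])
```

Rejecting any cause outside the six labels keeps a malformed reply out of the weekly digest.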
The Jevons cliff specifically
Most pricing-migration discussions stop at the trough. They don't address the second-order question: as inference costs drop 50% per year, what happens to your outcome pricing strategy?
The Jevons forecaster surfaces this. Three scenarios, each updated monthly:
- Hold prices. Margin per outcome rises from 65% (today) to 78% (12 months) to 84% (24 months). Revenue is stable. Customer-perceived value is rising; competitor risk is also rising because someone will undercut you.
- Partial pass (50% of cost decline). Margin per outcome stays around 65-67%. Customer prices drop slightly. Volume grows because customers can afford more. Competitor risk is lower.
- Pass savings (full). Margin per outcome stays around 65%. Customer prices drop significantly. Volume grows substantially. Margin per customer rises through volume even though margin per outcome is flat.
The CPO has to decide. The agent surfaces the data. Most companies are running the "hold prices" scenario by default (because nobody chose otherwise) and getting margin lift but increasing competitive exposure.
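The scenario arithmetic can be sketched in a few lines. This is not the agent's forecaster, only a toy model under stated assumptions: price is normalized to 1.0 today, only the inference share of cost declines (here an assumed 74%), and "pass-through" means price falls by that fraction of the percentage decline in total cost. The function and parameter names are hypothetical.

```python
def forecast_margin(
    margin_now: float,
    inference_share: float,
    months: int,
    annual_decline: float = 0.5,
    pass_through: float = 0.0,
) -> float:
    """Projected gross margin per outcome after `months`, as a fraction.

    margin_now       current gross margin per outcome (0.65 for 65%)
    inference_share  assumed fraction of today's cost that is inference
    annual_decline   assumed yearly drop in inference cost (0.5 for 50%)
    pass_through     fraction of the cost decline handed back as a price cut
    """
    cost_now = 1.0 - margin_now  # price is normalized to 1.0 today
    decay = (1.0 - annual_decline) ** (months / 12)
    # Non-inference cost is held flat; inference cost decays.
    cost_then = cost_now * ((1.0 - inference_share) + inference_share * decay)
    cost_drop = 1.0 - cost_then / cost_now
    price_then = 1.0 - pass_through * cost_drop
    return (price_then - cost_then) / price_then


for label, share_passed in [("hold prices", 0.0), ("partial pass", 0.5), ("full pass", 1.0)]:
    m12 = forecast_margin(0.65, 0.74, 12, pass_through=share_passed)
    m24 = forecast_margin(0.65, 0.74, 24, pass_through=share_passed)
    print(f"{label}: {m12:.1%} at 12 months, {m24:.1%} at 24 months")
```

With these assumed inputs, holding prices lands close to the 78% and 84% figures above, and a full pass-through keeps margin flat at 65%. A partial pass falls somewhere between the two, and where exactly depends on how pass-through is defined and on your real cost mix.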
What this changes about pricing reviews
Without the agent, the pricing review happens quarterly with FP&A as a participant. The data is FP&A's; the framing is FP&A's. Most CPOs end up nodding through it because the data isn't at their fingertips.
With the agent, the CPO arrives at the pricing review with the margin per outcome trend, the cohort breakdown, the Jevons forecast, and a specific question. The pricing review becomes a CPO-led decision with FP&A in support, not the other way around.
The cultural shift is that pricing stops being a finance function and becomes a product function. Where it should have been all along.
What to try this week
Compute gross margin per outcome by hand for one SKU, last week. Just that. One number for one product, one week of data.
If you can't compute it (data isn't accessible, components aren't tracked), you have a measurement gap that has to be fixed before the agent can run. Fix the gap.
If you can compute it, you'll discover the number is more variable than you think week-to-week. That variability is the agent's value: surfacing it, classifying it, forecasting it.
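The hand calculation might look like the sketch below. The field names mirror the cost ledger components listed earlier, the `WeeklyLedger` class is hypothetical, and every number is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeeklyLedger:
    """One SKU, one week (hypothetical schema; all amounts in dollars)."""
    outcomes: int
    revenue: float
    token_input_cost: float
    token_output_cost: float
    infrastructure_cost: float
    escalation_cost: float
    qa_cost: float

    @property
    def total_cost(self) -> float:
        return (
            self.token_input_cost
            + self.token_output_cost
            + self.infrastructure_cost
            + self.escalation_cost
            + self.qa_cost
        )

    def margin_per_outcome(self) -> float:
        """(Revenue - cost) / revenue, computed on per-outcome amounts."""
        if self.outcomes <= 0 or self.revenue <= 0:
            raise ValueError("outcomes and revenue must be positive")
        revenue_per_outcome = self.revenue / self.outcomes
        cost_per_outcome = self.total_cost / self.outcomes
        return (revenue_per_outcome - cost_per_outcome) / revenue_per_outcome


# Invented example: 10,000 resolutions billed at $0.50 each.
week = WeeklyLedger(
    outcomes=10_000,
    revenue=5_000.0,
    token_input_cost=600.0,
    token_output_cost=500.0,
    infrastructure_cost=300.0,
    escalation_cost=250.0,
    qa_cost=100.0,
)

print(f"total cost: ${week.total_cost:,.2f}")
print(f"gross margin per outcome: {week.margin_per_outcome():.1%}")
```

If any of the five cost fields has no source you can query, that field is the measurement gap.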
The full agent blueprint, including the cost ledger schema and the LLM prompt for compression classification, is at /artifacts/agent-margin-watch. The Margin Recovery Curve Model that the agent's forecasts feed into is at /toolkit/margin-recovery-curve-model.
Related
- The PM AI Agent Fleet, the 45-agent operating system this agent slots into.
- The Jevons Cliff in Outcome Pricing, the strategic frame the agent forecasts against.
- Margin Recovery Curve Model, the model the agent calibrates to.
Frequently asked
What does the margin watch agent do?
Computes gross margin per outcome every day for every major SKU and customer cohort, surfaces the trends eating margin (token cost, prompt regression, customer mix shift, infrastructure spike, escalation rate), and forecasts Jevons-cliff risk under three pricing scenarios over the next 6-12 months.
What is the Jevons cliff in outcome pricing?
Inference costs drop roughly 50% per year. An outcome price set at $0.50/resolution today earns a margin lift over the next 12-18 months that is unsustainable, because a competitor will eventually undercut you. The cliff is the moment a competitor prices below your cost-anchored level and you have to react: hold price, pass savings, or split.
How does the agent classify margin compression?
An LLM call takes the week's component deltas (tokens, model mix, customer mix, infrastructure, escalation) and classifies cause as model_provider_pricing, prompt_regression, customer_mix_shift, infrastructure, escalation_spike, or multi_factor, with a one-paragraph diagnostic and a recommended action.
Why daily and not quarterly?
By the time gross margin compression appears in the quarterly P&L, the conditions that caused it are months old. Daily classification surfaces the cause while it's still actionable. Pricing reviews stop being post-mortems and start being CPO-led decisions with FP&A in support.