Falk Gottlob · 4 min read

Testable Assumptions Tracker Agent

Convert opportunities into testable assumptions. Track validation status weekly. Know which assumptions are holding up your roadmap.

agents · discovery · experimentation

Try it live

See this agent running in the sandbox: stream a simulated run, inspect the notifications it would send on Slack and email, and see exactly where it sits in the 7-stage PM OS flow. No password required.

The short version

The Assumption Tracker agent converts each prioritized opportunity into three to five testable assumptions and tags each one Not Tested, Testing, Validated, or Invalidated. It runs weekly on Friday at 2 PM, reading the OST, recent interviews, and experiment results. The output is one living document that flags any feature you're about to build on untested assumptions, plus a ranked list of which assumptions are cheapest to test first. I use it to stop teams from shipping four-week builds against assumptions nobody validated. Start by listing five assumptions behind your next sprint commitment.

You're about to build a feature that takes 4 weeks. But you haven't actually tested whether customers want it. You're assuming.

The problem is managing assumptions across a team. Someone validated the customer pain in interviews, but the feature you built assumes a different solution. Someone ran a Slack survey that said "add feature X," but nobody validated that users would actually use feature X at the volume required to move the needle.

The Assumption Tracker agent converts opportunities into testable assumptions, tracks validation status, and flags which assumptions are still holding up your roadmap.

Every Friday, it updates a living document: here's what we're assuming, here's what we've tested, here's what's still uncertain.

How It Works

The agent works with two inputs: opportunities and experiment results.

Assumption generation: Takes a prioritized opportunity like "SMB customers struggle with slow onboarding" and breaks it into testable assumptions:

  • Assumption 1: "SMB customers want faster onboarding" (validated: interviews, support data)
  • Assumption 2: "They'd use an automated setup flow if available" (not tested)
  • Assumption 3: "A 30-minute setup vs. 2-hour setup would reduce churn by 5%+" (not tested)
  • Assumption 4: "Automated setup would take engineering less than 3 weeks" (not tested)

Validation tracking: Each assumption has a status: Not Tested, Testing, Validated, or Invalidated. The agent tracks when it was tested, what the result was, what changed as a consequence, and what the next test should be.
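If you keep the tracker in code rather than a doc, those fields map onto a small data model. Here's a minimal sketch; the class and field names are illustrative, not the agent's actual schema:

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Status(Enum):
    NOT_TESTED = "Not Tested"
    TESTING = "Testing"
    VALIDATED = "Validated"
    INVALIDATED = "Invalidated"


@dataclass
class Assumption:
    opportunity: str          # e.g. "SMB customers struggle with slow onboarding"
    statement: str            # the specific, testable claim
    status: Status = Status.NOT_TESTED
    last_tested: date | None = None   # when it was last tested
    result: str | None = None         # what the test showed
    next_test: str | None = None      # the cheapest next validation step
```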

Risk flagging: If you're building a 4-week feature based on assumptions that haven't been tested, it gets flagged. "You're assuming users will adopt this, but you haven't tested the adoption."
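With the hypothetical model above, the risk check is just a filter: surface any feature whose underlying assumptions are still untested. A sketch:

```python
def flag_risky_builds(features: dict[str, list[Assumption]]) -> list[str]:
    """Warn about features resting on untested assumptions.

    `features` maps a feature name to the assumptions behind it;
    the structure is illustrative, not the agent's actual interface.
    """
    warnings = []
    for feature, assumptions in features.items():
        untested = [a for a in assumptions if a.status is Status.NOT_TESTED]
        if untested:
            warnings.append(
                f"{feature}: {len(untested)} untested assumption(s), "
                f"starting with '{untested[0].statement}'"
            )
    return warnings
```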

Data Sources and Setup

Prerequisites: Complete the Claude setup guide first. You'll need:

  • OST / Prioritized opportunities: From the opportunity prioritization agent
  • Research documents: Interviews, surveys, user testing results
  • Experiment tracking: Mixpanel experiments, feature flags, A/B tests, or Notion database
  • Engineering estimates: Feature specs and effort estimates

Schedule: Weekly Friday at 2 PM. Also updates when new experiment results come in.
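If you run the agent on your own infrastructure, that schedule translates to a standard crontab entry (the script name here is a placeholder, not part of the setup guide):

```
# Every Friday at 2 PM, regenerate the assumption tracker
0 14 * * 5  python run_assumption_tracker.py
```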

The Claude Prompt

You are tracking assumptions and their validation status.

Here are our prioritized opportunities:
[OPPORTUNITIES LIST]

Here's our research and validation history:
[RESEARCH DATA: interviews, surveys, test results from past 3 months]

Here are current and recent experiments:
[EXPERIMENT DATA: results, holdout groups, metrics, conclusions]

Please analyze and report:

1. **For each of the top 10 opportunities, break down:**
   - The core problem statement
   - 3-5 testable assumptions (specific and measurable, not vague)
   - For each assumption, current status: Not Tested / Testing / Validated / Invalidated
   - Evidence for each status (what data supports it?)

2. **Assumption Validation Ranking**
   - Which untested assumptions would be quickest to validate?
   - Which assumptions carry the most risk if wrong?
   - Which assumptions matter most for your next sprint?

3. **Experiment Recommendations**
   - For each untested assumption, suggest how to test it
   - Rough estimate: days to run the test? sample size needed?
   - What's the minimum viable validation?

4. **Red Flags** (IMPORTANT)
   - Are you about to build a feature based on untested assumptions?
   - Which assumptions, if invalidated, would kill the whole opportunity?
   - Are there conflicting assumptions between different opportunities?

5. **Assumption Evolution**
   - Which past assumptions were validated? Which were invalidated?
   - Did you learn anything that should change your roadmap?

Format as a tracker: opportunity → assumptions → status → evidence → next test.
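To run this outside the agent, you can send the same prompt through the Anthropic Python SDK. A minimal sketch, assuming you've already assembled the three bracketed inputs as strings; the variable names and model ID are placeholders:

```python
import anthropic

# Assembled elsewhere: these stand in for the bracketed sections above.
opportunities = "..."
research_data = "..."
experiment_data = "..."

prompt = f"""You are tracking assumptions and their validation status.

Here are our prioritized opportunities:
{opportunities}

Here's our research and validation history:
{research_data}

Here are current and recent experiments:
{experiment_data}
"""  # append the numbered analysis instructions from the prompt above

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
message = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder; use the model you run
    max_tokens=4096,
    messages=[{"role": "user", "content": prompt}],
)
print(message.content[0].text)  # the tracker-formatted report
```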

What You Get

Instead of building on assumptions:

  • Validated roadmap: You know which features rest on tested vs. speculative assumptions
  • Risk visibility: Red flags on high-risk assumptions before you start building
  • Testing sequencing: Instead of guessing, you know which assumptions to test first and how
  • Learning memory: You track what worked and what didn't so you don't repeat mistakes

Real outcomes:

  • You save weeks by testing assumptions before building (not after)
  • Your features have better adoption because they're validated before launch
  • Your team talks the same language about risk and evidence

For the full agent fleet and scheduling details, see Your AI Agent Fleet.
