
The short version
This is a composite story of one customer request moving through the three new agents in the fleet. Monday afternoon: a CSM at Acme Corp sends a Slack DM. Ten days later: the feature is live, announced across six marketing channels, with the customer's quote on the landing page. The old loop would have taken 8 weeks. The new one took 10 days. Three agents, four clickable artifacts, and a much shorter argument.
Monday, 4:02pm. The Slack DM.
Maria Teruel is the ops manager at Acme Corp. She runs a team of linguists translating enterprise contracts, investor decks, help articles, and internal training across 14 languages. Her ops problem, every week: when a linguist calls in sick, she has to manually reassign their in-flight tasks. Click, confirm, click, confirm, for 25 to 40 tasks. It takes her 20 minutes, and she does it roughly once a week.
Today she does it twice because two linguists are out. She finishes the second reassignment at 3:58pm, cursing quietly, and pings Acme's CSM at Smartcat:
"Can you please ask product to let us bulk-reassign tasks? I just spent 45 minutes clicking through two linguists' queues. This has to exist somewhere, right?"
The CSM thanks her, tags the message in our #product channel at 4:02pm, and goes back to her week.
4:03pm. Instant Prototype Agent reads the message.
The agent is watching the #product channel for tagged signals. The Slack thread comes in. It ingests the message, looks up Acme's account tier (enterprise), and searches the last 90 days of customer signals for anything semantically similar.
It finds four: two support tickets from different accounts asking for the same thing, one Gong moment where an enterprise prospect specifically asked about bulk reassignment in their last call, and one Salesforce opportunity note from three months ago. The agent bundles all of this into context.
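Mechanically, that lookup is an embedding-similarity search over recent signals. A minimal sketch of the shape of it, with the embedding client injected as a parameter; the names and the 0.82 cutoff are illustrative assumptions, not production values:

```typescript
// Sketch: find semantically similar customer signals from the last 90 days.
// `embed` is an injected embedding client; Signal fields mirror the sources
// named above (Zendesk, Gong, Salesforce, Slack). All names are illustrative.
interface Signal {
  source: "zendesk" | "gong" | "salesforce" | "slack";
  account: string;
  text: string;
  createdAt: Date;
}

async function findSimilarSignals(
  message: string,
  signals: Signal[],
  embed: (text: string) => Promise<number[]>,
  threshold = 0.82, // assumed similarity cutoff, tuned during shadow mode
): Promise<Signal[]> {
  const query = await embed(message);
  const scored = await Promise.all(
    signals.map(async (s) => ({ s, score: cosine(query, await embed(s.text)) })),
  );
  return scored
    .filter((x) => x.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .map((x) => x.s);
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1); // guard divide-by-zero
}
```

In a real pipeline the signal embeddings would be precomputed and stored rather than re-embedded per query; the point here is only the shape of the match.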
Then it reads the design system. The enterprise/tables/* component library has the bones of what this feature would need: a selectable table, a bulk action toolbar, an action modal. Not exactly, but close. It reads the current TaskTable component in the web-app repo to understand how it's composed.
At 4:05pm it starts generating. Claude Code writes a BulkReassignModal component, a selection state hook, a toolbar extension, and a small audit log helper. 4 minutes 12 seconds later, the code is done.
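The selection state hook is the smallest of those pieces. A minimal sketch of what such a hook might look like; useTaskSelection and its shape are my illustration, not the actual generated code:

```typescript
import { useCallback, useState } from "react";

// Illustrative selection-state hook for a selectable task table.
export function useTaskSelection() {
  const [selected, setSelected] = useState<Set<string>>(new Set());

  // Toggle one task in or out of the selection.
  const toggle = useCallback((taskId: string) => {
    setSelected((prev) => {
      const next = new Set(prev);
      if (next.has(taskId)) next.delete(taskId);
      else next.add(taskId);
      return next;
    });
  }, []);

  // Clear the selection after a bulk action completes.
  const clear = useCallback(() => setSelected(new Set()), []);

  return { selected, toggle, clear, count: selected.size };
}
```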
At 4:09pm:
- A GitHub PR is open: feature/bulk-reassign-linguist, draft status, description auto-generated with the customer story.
- A Vercel preview is deployed: a clickable prototype of the feature with mock data.
- A Linear ticket is filed: TRANS-1847, assigned to me, acceptance criteria drafted.
- A Notion doc exists: customer context, JTBD, linked artifacts.
- A message in #product: "Bulk Reassign: clickable prototype shipped to Acme CSM in 7m 34s."
The CSM sees the message at 4:11pm, forwards the preview URL to Maria, and writes: "Can you take a look at this and tell us what's missing?"
Monday evening. Maria clicks.
Maria opens the preview URL from her phone on the train home. She selects three tasks, picks a new assignee, and hits Reassign. The feature works. She pokes around for a few minutes, then replies to the CSM at 6:47pm:
"This is close. Two things: (1) need to see the new assignee's current task count before I pick them, because I don't want to load up someone who's already swamped. (2) the audit log should show the original assignee's name, not just their ID. Other than that, this is what I've been asking for."
The CSM forwards Maria's note to me. I read it on Tuesday morning.
Tuesday. The PM work.
This is the part the agents don't replace. I read Maria's two asks, compare them against the generated prototype, and make a call. Ask #1 (show current task load) is clearly right and cheap to add. Ask #2 (show original assignee name) is already in the data model; the prototype just didn't render it.
I update the Linear ticket's acceptance criteria, add two sub-tasks, and ping the eng team's #bulk-reassign channel (which the agent created yesterday). I also update the Notion doc with Maria's feedback so it stays the single source of truth.
Total time for me: about 25 minutes. Most of it was deciding, not writing.
Wednesday. Engineering takes it.
Priya, the engineering lead, opens the PR on Wednesday morning. The prototype code is 80 percent right and 20 percent off: the state management is fine, the component structure is fine, the audit log helper uses an old pattern we're deprecating. She pairs with a mid-level engineer for 3 hours to tighten the edges, refactor the audit log to use the new pattern, add the task-load indicator Maria asked for, wire up the permission check for enterprise-tier accounts only, and write the unit tests the prototype didn't include.
Wednesday 4pm: the PR is ready for real review. Not draft anymore. Three reviewers (PM, eng lead, design lead) sign off by Thursday noon. Merge at 12:17pm.
The 3 hours Priya spent is the entire engineering cost of this feature. The prototype gave them a working scaffold. The customer context and ACs were already in Linear. The tests and polish were the only new work.
Thursday afternoon. Auto Bugfix Agent catches something.
A customer ticket fires at 2:41pm. Different account (Globex), different feature: "Our CSV export is hanging on projects over 10k rows."
Auto Bugfix Agent ingests it. By 2:45pm it has reproduced the issue with a 12k-row fixture and located the bug in src/exports/csv.ts: a synchronous O(n²) string concatenation inside a for-loop. By 2:49pm it has written a fix using a streaming writer with 1k-row batches. By 2:52pm it has a regression test covering 10k, 25k, 50k rows. By 2:53pm it opens PR #3182, ready for review.
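The actual patch lives in PR #3182; as an illustration of the pattern only, here is a simplified before-and-after, assuming Node streams:

```typescript
import { Writable } from "node:stream";

// Before (simplified): each += re-copies the accumulated string, so total
// work grows roughly quadratically with row count. Fine at 500 rows,
// pathological at 10k+.
function exportCsvSync(rows: string[][]): string {
  let out = "";
  for (const row of rows) out += row.join(",") + "\n";
  return out;
}

// After (simplified): flush fixed-size batches to a writable stream, so
// per-write cost and memory stay bounded regardless of project size.
const BATCH_SIZE = 1_000;

async function exportCsvStreaming(rows: string[][], dest: Writable): Promise<void> {
  let batch: string[] = [];
  for (const row of rows) {
    batch.push(row.join(","));
    if (batch.length === BATCH_SIZE) {
      await writeBatch(dest, batch);
      batch = [];
    }
  }
  if (batch.length > 0) await writeBatch(dest, batch);
  dest.end();
}

function writeBatch(dest: Writable, lines: string[]): Promise<void> {
  return new Promise((resolve, reject) => {
    dest.write(lines.join("\n") + "\n", (err) => (err ? reject(err) : resolve()));
  });
}
```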
Daniel, our on-call engineer, reviews it at 3:08pm while eating a sandwich. He suggests one nit (use our existing writeLine helper instead of inventing a new one), waits 90 seconds for the agent to apply the edit, re-runs CI, and merges at 3:14pm.
Globex's export is unblocked 33 minutes after the ticket was filed, before anyone at Globex has even checked back on it. Total engineering time: 6 minutes.
Not every Zendesk ticket fits the Auto Bugfix Agent. The ones that do, like this one, vanish from the on-call engineer's day.
Friday. The bulk-reassign feature ships to production.
Thursday's 12:17pm merge triggers a CI build. Friday morning at 10:04 UTC, the feature is live. Feature flag at 100% rollout for enterprise-tier accounts. Self-serve accounts see an upsell tooltip. Audit log is writing to the production database.
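A tier-gated flag like that is a few lines of logic. An illustrative sketch, assuming hash-bucketed percentage rollout; the shapes and names are mine, not Smartcat's real flag system:

```typescript
// Illustrative feature-flag check: tier gate first, then percentage bucket.
type AccountTier = "enterprise" | "self-serve";

interface FeatureFlag {
  name: string;
  rolloutPercent: number;      // bulk-reassign is at 100
  allowedTiers: AccountTier[]; // enterprise-only for this feature
}

function isEnabled(flag: FeatureFlag, accountId: string, tier: AccountTier): boolean {
  // Self-serve accounts fail the tier gate and fall through to the upsell tooltip.
  if (!flag.allowedTiers.includes(tier)) return false;
  // Deterministic bucketing keeps an account's flag state stable across sessions.
  const bucket = hash(`${flag.name}:${accountId}`) % 100;
  return bucket < flag.rolloutPercent;
}

function hash(s: string): number {
  let h = 0;
  for (let i = 0; i < s.length; i++) h = (h * 31 + s.charCodeAt(i)) >>> 0;
  return h;
}
```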
Friday, 10:14am. Launch Comms Agent wakes up.
The agent watches for production deploys with a Linear ticket tagged launched. Bulk-reassign's launch fires the webhook. The agent pulls:
- The Linear ticket (TRANS-1847) with the acceptance criteria
- The Notion doc with Maria's feedback and the customer story
- The release notes entry
- A Figma export of the bulk-reassign modal
- The voice.yaml file from the repo
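The trigger itself is plain plumbing. A hedged sketch of how the deploy webhook might gate on the launched tag, with the Linear lookup and the launch-kit job injected as parameters; nothing here is the real integration:

```typescript
// Sketch of the deploy-webhook handler that wakes the Launch Comms Agent.
interface DeployEvent {
  commitSha: string;
  linearTicketId: string | null; // e.g. parsed from the merge commit message
}

async function onProductionDeploy(
  event: DeployEvent,
  getTicketLabels: (ticketId: string) => Promise<string[]>,
  startLaunchKit: (ticketId: string) => Promise<void>,
): Promise<void> {
  if (!event.linearTicketId) return; // deploy not tied to a ticket: ignore
  const labels = await getTicketLabels(event.linearTicketId);
  if (!labels.includes("launched")) return; // only tagged launches get comms
  // Kicks off the context pull: ticket, Notion doc, release notes,
  // Figma export, voice.yaml.
  await startLaunchKit(event.linearTicketId);
}
```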
It generates six channel drafts in parallel. By 10:18am all six are ready:
- Landing page copy: a new feature section at smartcat.ai/features/bulk-reassign with Maria's quote as the pull-quote.
- LinkedIn post from my account, in the "operator, not marketer" tone my voice rules specify. Scheduled for Tuesday morning.
- Customer email segmented to 247 enterprise ops contacts. Subject line: "Your ops team just got a new superpower."
- In-app banner on the task list for enterprise accounts only, dismissed on first use.
- Public changelog entry for v4.12.3 (covering both Bulk Reassign and the CSV export fix from Thursday).
- X thread narrating the 10-day loop as a case study.
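"In parallel" here is ordinary promise fan-out. A minimal sketch, with the per-channel generator injected; the channel names match the list above, the signature is my assumption:

```typescript
// Sketch: generate all six channel drafts concurrently, so wall-clock
// time is the slowest single channel rather than the sum of all six.
const CHANNELS = [
  "landing-page",
  "linkedin",
  "customer-email",
  "in-app-banner",
  "changelog",
  "x-thread",
] as const;

type Channel = (typeof CHANNELS)[number];

async function generateLaunchKit(
  context: string, // bundled ticket + Notion doc + release notes + voice rules
  draftFor: (channel: Channel, context: string) => Promise<string>,
): Promise<Record<Channel, string>> {
  const drafts = await Promise.all(
    CHANNELS.map(async (ch) => [ch, await draftFor(ch, context)] as const),
  );
  return Object.fromEntries(drafts) as Record<Channel, string>;
}
```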
Product Marketing opens each draft, edits, approves. The landing page goes live Monday morning. The email sends Monday at 15:00 local. The LinkedIn post goes out Tuesday.
Total PMM time from draft-ready to published: about 90 minutes. Before the agent, this was a week of writing plus another week of coordination.
Next Monday. Maria's reply.
Maria opens her inbox on Monday morning. The enterprise ops email is there. She clicks through to the demo link. She spends two minutes confirming the production version matches the prototype she tested a week ago, then replies to the CSM:
"You were not joking. This is exactly what we asked for and it's already on our account. I don't think I've ever seen a feature request go from 'I'm complaining on Slack' to 'it's live and you've emailed me about it' in one week. What do you charge for a PM team that does this?"
The CSM forwards the note to me and to PMM. PMM adds the quote to the landing page. The feature's first-week adoption is 84 percent of eligible accounts: every ops lead who got the email used it within five days.
The loop, mapped
One customer request. Three agents. Four humans (the CSM, me, Priya, PMM). Ten days from Slack DM to launched-and-measured feature. Eight weeks of work collapsed into ten days of typing and deciding.
The loop by day:
- Monday: Prototype generated, customer feedback collected.
- Tuesday: PM digests feedback, updates scope.
- Wednesday: Engineering takes it to production quality.
- Thursday: PR reviewed and merged. Auto Bugfix Agent catches a separate customer bug in parallel.
- Friday: Feature deploys. Launch Comms Agent generates the launch kit.
- Next Monday: Launch kit ships. Customer replies. Feature is live, announced, and adopted.
The thing that changed isn't the speed of individual steps. The thing that changed is that the handoffs stopped costing days each. Prototype to customer feedback: minutes, not a week. Feedback to updated scope: the same morning, not the next sprint. Merge to launch: 18 hours, not 10 business days.
Humans still make every decision. They just make them with better raw material, faster.
What's still human
- Deciding whether the request is worth prototyping at all. The agent flags first-time requests; the PM decides.
- Judging the prototype's quality before forwarding to the customer. The CSM looks at it for 30 seconds before sending.
- Engineering review and merge. Always.
- Customer empathy in the feedback loop. The prototype is a starting point for a conversation, not the conversation.
- PMM editing and voice work. The agent writes the first draft; PMM makes it good.
- Business decisions: pricing tier, permission model, deprecation of the old flow.
The humans didn't get replaced. They got amplified. Three agents doing the typing that used to fill their calendar. That's the transformation.
What I learned running this for a quarter
- Context is the whole game. The quality of the agent's output tracks exactly with the quality of the context it can read. Clean project_map.yaml files, good voice config, well-labeled Linear tickets: these compound. (A minimal illustrative project map follows this list.)
- Shadow mode for two weeks, always. Don't let any agent open PRs, send emails, or ping customers until you've shadow-run it for two weeks on your actual data. The first two weeks always surface calibration problems.
- Review is the pressure valve. Every agent in this loop has a human reviewer before anything customer-facing happens. That's non-negotiable. The speed gain is real; the safety gain only works if review stays in the loop.
- Cycle time compounds. Week one the loop was 14 days, because PMM took a while on the first draft. By week eight it was 8 days, because PMM had trained the agent on its own voice through the edit-capture mechanism. By quarter end, new loops settle at 7 to 10 days.
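On the project map point, here is a minimal illustration of what such a file might hold; the keys are hypothetical, not the actual schema:

```yaml
# Hypothetical project_map.yaml sketch; keys are illustrative, not the real schema.
product_area: task-management
repos:
  web-app:
    components: src/components/tasks/
    design_system: enterprise/tables/*
owners:
  pm: "@pm"
  eng_lead: "@priya"
linear_team: TRANS
voice_config: voice.yaml
```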
Try it yourself
The three agents in this story:
- Instant Prototype Agent (Build stage)
- Auto Bugfix Agent (Build stage)
- Launch Comms Agent (Ship stage)
Run the live simulation in the Agent Sandbox. Click each one, trigger a run, and then click any of the output pills. The preview modal shows exactly what each artifact looks like, all generated inside the sandbox so you can poke at them without leaving the page.
The 8-week loop was the old ceiling. The 10-day loop is the current one. The floor is still moving.
Build yours.
Frequently asked
Is this a real customer story?
It's a composite. The request, the dates, and the individual steps are drawn from actual patterns I've watched in the last quarter running this stack at Smartcat. The specific characters (Maria at Acme, the exact timestamps) are fictionalized to protect real customer conversations, but the shape of the loop is exactly how it runs in practice.
Which three agents are in the loop?
Instant Prototype Agent (signal to clickable prototype), Auto Bugfix Agent (customer bug report to reviewable PR), and Launch Comms Agent (shipped feature to launch kit across six marketing channels). Each has its own blueprint; this post is the narrative of them working together.
What was the old loop this replaced?
Signal lands in Slack or Zendesk. PM triages, writes PRD, argues with eng about scope. Eng writes code over 3 to 6 weeks. QA, release, marketing, launch. Customer sees the result 6 to 10 weeks after asking. In between, the customer's context has shifted and the original problem has either been hacked around or churned into a different shape.
What's the measured time savings?
The signal-to-shipped cycle fell from 8 weeks (56 days) to 10 days on the composite example here. Across the full portfolio at Smartcat over a quarter, median cycle time dropped about 60 percent. Individual loops range from 5 days (bug fixes on well-tested surfaces) to 20 days (features requiring architectural work).
Does this replace PMs and engineers?
No. Every decision point still has a human: PM decides whether the request is worth prototyping, eng reviews the generated code, PMM edits the launch copy, CS decides when to send the customer the preview. The agents handle the typing, the cross-referencing, the deployment plumbing. Humans make the judgment calls faster, with better raw material.