Playing Planning Poker with Claude
If you're building something for yourself, you skip estimation. You just start coding. Why plan when you already know what you want?
If you're building for someone else, you do estimate. Sort of. You give them the gut number. The one that feels right based on the last time you did something vaguely similar.
And 94% of the time, that gut number is wrong. The project takes 4x longer than you said it would.
I've done this more times than I'll admit. The problem isn't laziness or lack of experience. The problem is that one brain defaults to one perspective. You see the version of the task that makes sense to you, and you miss everything else.
"That is easy" is one of my most common phrases in software development. When I say easy, I'm forgetting the 40 subtasks I didn't account for, the ones that only show up during the actual development.
What Is Planning Poker?
Planning poker is a technique software teams use to estimate how long work takes. Everyone on the team looks at the same task. Then, without seeing each other's answers, they all hold up a card with their estimate. 1 point. 2 points. 5 points. Whatever they think.
The magic happens when the numbers don't match.
If one person says "1 hour" and another says "8 hours," you don't average them and move on. You stop and talk. The low estimator explains why they think it's simple. The high estimator explains what hidden work they see. That conversation is where teams discover the dependencies, edge cases, and integration work that nobody would have caught alone.
The name comes from the cards. It looks like a poker hand reveal. But the real game is forcing independent thinking. If everyone discussed it first, they'd anchor on the loudest voice in the room. The simultaneous reveal prevents that.
Planning poker works because disagreement surfaces hidden work. When one engineer says "2 hours" and another says "2 days," that gap isn't a mistake. Someone saw a dependency. Someone remembered the last time this "simple change" broke production.
But planning poker requires a team. Most solo founders, indie hackers, and vibe coders don't have one.
So I built a team. Four AI agents inside Claude Code, each with a different personality and estimation bias. They estimate. I estimate. Five voters, five perspectives. I'm not watching from the sidelines. I play too, and I always have the final call.
Meet the Panel
Each agent has a distinct persona that creates a specific estimation bias. They look at the same task independently and vote. Then I vote. When the numbers match, we move fast. When they don't, that's where the real conversation starts.
The Optimist. Sees the happy path. Assumes the docs are clear, the code is clean, and nothing will go wrong. Their job is to define the minimum viable scope. "What if everything goes perfectly? What's the fastest path to shipping this?"
The Skeptic. Hunts edge cases and integration risks. Assumes something will go wrong, because it usually does. Their job is surfacing hidden work. "What are we forgetting? What's the dependency nobody mentioned?"
The Busy Exec. Has 20 minutes between flights. Forces ruthless simplification. If a task can't be explained in one sentence, it's too big. Their job is cutting scope. "Can this be smaller? Do we need all of this right now?"
The Multitasker. Gets interrupted constantly. Knows that a "1-hour task" takes 2 hours in real life because of Slack, email, context switching, and getting back into flow. Their job is accounting for friction. "How long does this take on a normal day with interruptions?"
Five voters. When they converge, I move fast. When they diverge, I slow down and ask why.
The divergence is the entire point.
A Real Example
Task: "Build a landing page for the new campaign."
Simple enough. I would have said "1 hour" and started coding. Instead, I ran it through the panel.
| Voter | Estimate | Reasoning |
|---|---|---|
| Optimist | 1 pt | Clone existing page, swap copy. Done. |
| Skeptic | 4 pts | Need design review, copy, form integration, tracking pixels, mobile testing. |
| Busy Exec | 1 pt | Use the template. Ship it ugly. Iterate later. |
| Multitasker | 2 pts | 1 hour of work plus 1 hour of setup, testing, and getting the copy right. |
| Me | 1 pt | I'll just build it in my meeting with Jim. |
I put down 1 point. "I'll just build it in my meeting with Jim." Classic gut estimate. No thought about what "build it" actually means.
The Skeptic and the Optimist were 4x apart. That spread told me something I wouldn't have seen on my own.
The Skeptic was right about the form integration. I forgot I needed to wire up the HubSpot form and add UTM tracking. That alone was an extra 30 minutes I wouldn't have budgeted. The Busy Exec was right that I didn't need custom design. Clone the template. Ship it.
Final estimate: 2 points. Honestly, that's what the Multitasker said all along. One focused task, not a sprawling story, but not the 1-point throwaway I convinced myself it was.
Without the panel, I would have said "1 hour" and spent 3. With the panel, I planned for 2 hours and shipped on time with nothing missing.
How It Works Technically
Each agent is a Claude Code subagent spawned using the Task tool. All four run in parallel, so the estimation round takes about 30 seconds total. Each agent gets:
- The task description
- Their persona definition and bias instructions
- The point scale (1 point = ~1 hour, max 2 points per task)
- Project context so they understand the codebase
They return structured output: task or story, point estimate, and brief reasoning. If any agent says a task is bigger than 2 points, it gets broken into subtasks automatically.
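Here is a minimal sketch of that fan-out, assuming Python's `asyncio` as a stand-in for the parallel subagents. The `ask_agent` function is a hypothetical stub that returns canned votes from the landing page example. It does not call Claude Code or the Task tool. The `Estimate` fields mirror the structured output described above, and labeling anything over the cap a "story" is my assumption.

```python
import asyncio
from dataclasses import dataclass

MAX_POINTS_PER_TASK = 2  # from the point scale: 1 point = ~1 hour, max 2 per task


@dataclass
class Estimate:
    persona: str
    kind: str        # "task" or "story"
    points: int
    reasoning: str


# Canned votes for illustration only (taken from the landing page example).
CANNED = {
    "Optimist": (1, "Clone existing page, swap copy."),
    "Skeptic": (4, "Design review, copy, form integration, tracking, mobile testing."),
    "Busy Exec": (1, "Use the template. Ship it ugly."),
    "Multitasker": (2, "1 hour of work plus 1 hour of setup and testing."),
}


async def ask_agent(persona: str, task: str) -> Estimate:
    """Hypothetical stand-in for one subagent call."""
    await asyncio.sleep(0)  # placeholder for the real round trip
    points, reasoning = CANNED[persona]
    # Assumption: anything over the cap is reported as a "story".
    kind = "story" if points > MAX_POINTS_PER_TASK else "task"
    return Estimate(persona, kind, points, reasoning)


async def run_panel(task: str) -> list[Estimate]:
    # All four personas estimate concurrently and independently.
    return await asyncio.gather(*(ask_agent(p, task) for p in CANNED))


def needs_breakdown(estimates: list[Estimate]) -> bool:
    # Any single vote above the cap flags the task for splitting.
    return any(e.points > MAX_POINTS_PER_TASK for e in estimates)


estimates = asyncio.run(run_panel("Build a landing page for the new campaign."))
for e in estimates:
    print(f"{e.persona}: {e.points} pt ({e.kind}) - {e.reasoning}")
print("Break into subtasks:", needs_breakdown(estimates))
```

Because the Skeptic's vote is above the 2-point cap, this sketch flags the landing page for a breakdown.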
After all four respond, I see the comparison table, add my own vote, and make the call. For items where everyone agrees, I spend zero time discussing. For items where they disagree, I dig in.
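The agree-or-dig-in decision can be sketched as a simple spread check. This is an illustration, not my actual tooling: the `triage` function, the `max_spread` threshold, and the "Fix typo in footer" item are all hypothetical. The landing page votes come from the table above.

```python
def triage(votes: dict[str, int], max_spread: int = 1) -> str:
    """Label a backlog item by how far apart its votes are.

    max_spread is an assumed threshold, not a value from the article.
    """
    spread = max(votes.values()) - min(votes.values())
    return "converged" if spread <= max_spread else "diverged"


backlog = {
    # Hypothetical item where everyone agrees.
    "Fix typo in footer": {
        "Optimist": 1, "Skeptic": 1, "Busy Exec": 1, "Multitasker": 1, "Me": 1,
    },
    # Votes from the landing page example.
    "Build landing page": {
        "Optimist": 1, "Skeptic": 4, "Busy Exec": 1, "Multitasker": 2, "Me": 1,
    },
}

for item, votes in backlog.items():
    print(f"{item}: {triage(votes)}")
```

Converged items get accepted as-is. Diverged items are the ones worth a conversation.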
The time savings compound. Five backlog items take about 3 minutes to estimate this way. A full sprint's worth of grooming (15-20 items) takes 10 minutes. Compare that to the hour-long grooming meetings I've sat through in my career.
Why the Disagreement Matters
Most solo builders estimate from one perspective. Usually the optimistic one. "I've done something like this before, it took an hour, so this should be about an hour."
That reasoning misses three things:
Hidden dependencies. The Skeptic finds these. "Did you remember you need to update the API endpoint too? And write a migration?"
Scope inflation. The Busy Exec catches this. "You said 'landing page' but you're designing a full marketing funnel. Pick one page and ship it."
Real-world overhead. The Multitasker accounts for this. "Sure, the coding is 1 hour. But you'll spend 30 minutes setting up the dev environment, 20 minutes testing, and another 20 minutes because Slack will interrupt you twice."
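The Multitasker's numbers add up quickly. Here is a quick tally in Python, using only the figures quoted above:

```python
coding_minutes = 60
overhead_minutes = {
    "dev environment setup": 30,
    "testing": 20,
    "Slack interruptions": 20,
}

total_minutes = coding_minutes + sum(overhead_minutes.values())
total_hours = total_minutes / 60

print(f"{total_minutes} minutes, about {total_hours:.1f} hours")
# prints "130 minutes, about 2.2 hours"
```

The "1 hour" task is more than 2 hours once the overhead is counted.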
One person can't hold all of these perspectives at once. Four agents can. And they do it in 30 seconds.
The Broader Pattern
This isn't limited to estimation. The pattern, spawning multiple AI agents with different biases to create structured disagreement, applies anywhere a single perspective creates blind spots.
Code review. Architecture decisions. Content planning. Pricing strategy. Hiring evaluations.
Any decision where "I'll just go with my gut" has historically led to regret is a candidate for this pattern. Build a panel. Give each agent a clear bias. Let them argue. Then make the call yourself.
The AI doesn't decide. You decide. But you decide with information you wouldn't have had otherwise.
What Vibe Coders Are Missing
The "vibe coding" movement is real, and I'm part of it. Build fast, ship fast, iterate fast. But there's a gap in the philosophy. Speed without estimation isn't speed. It's chaos with momentum.
Planning poker with AI agents takes 30 seconds per task. That's not a process burden. That's a sanity check. And the delta between "I think this takes an hour" and "this takes 3 hours in practice" is the difference between shipping on Friday and pushing to next week.
I've run 31 marathons. Every single one required a pace plan. Not because planning is fun, but because running 26.2 miles without a plan means hitting the wall at mile 18. Estimation is your pace plan for building software.
The runners who say "I'll just run by feel" are the ones walking at mile 20. The builders who say "I'll just code by vibes" are the ones scrambling at deadline.
Try It Yourself
The setup is simple. You need Claude Code and about 10 minutes to configure the personas. I've documented the full process, the agent prompts, the point scale, and the execution flow.
The planning poker skill isn't built as a public tool yet (it's on my backlog), but the SOP and agent prompt templates are ready to share. If you want the prompts and the process, follow me on LinkedIn. I'll share them when they're packaged up.
In the meantime, you can start with the core idea: take your next task, describe it to Claude four times with four different persona instructions, and compare the estimates. The spread will surprise you.
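If you want a starting point, here is a rough sketch of building the four prompts in Python. The persona instructions are hypothetical paraphrases of the panel descriptions earlier in this post, not my actual prompt templates, and `build_prompt` is an illustrative helper. Paste each prompt into its own Claude conversation so the estimates stay independent.

```python
# Hypothetical persona instructions, paraphrased from the panel descriptions.
# These are NOT the author's actual prompts.
PERSONAS = {
    "Optimist": "Assume the happy path: clear docs, clean code, nothing goes wrong.",
    "Skeptic": "Hunt for edge cases, integration risks, and unmentioned dependencies.",
    "Busy Exec": "Cut scope ruthlessly. Ask whether this can be smaller.",
    "Multitasker": "Account for interruptions, context switching, and setup time.",
}

POINT_SCALE = "1 point = ~1 hour, max 2 points per task"


def build_prompt(persona: str, task: str) -> str:
    """Assemble one estimation prompt; the wording is illustrative only."""
    return (
        f"You are the {persona}. {PERSONAS[persona]}\n"
        f"Point scale: {POINT_SCALE}.\n"
        f"Task: {task}\n"
        "Reply with a point estimate and one sentence of reasoning."
    )


task = "Build a landing page for the new campaign."
prompts = {name: build_prompt(name, task) for name in PERSONAS}

for name, prompt in prompts.items():
    print(f"--- {name} ---\n{prompt}\n")
```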