The Emerging Token Gap: How AI Spending Is Splitting Engineers Into Tiers
A new divide is forming in software engineering: how much money flows through your AI tools every day.
On one end, engineers on a $20/month subscription can do real agentic work, but the limits are so tight they’re barely scratching the surface of what’s possible.
On the other end, Karel D'Oosterlinck, a researcher at OpenAI, recently shared that he spent $10,000 on Codex tokens to automate his research workflows. He orchestrates what he calls “a battalion of agents” that crawl Slack, read documents, analyze data, and write code in parallel. No meetings, no emails, no asking around.
Three tiers are forming
The consumer tier (~$20/month) is where most developers sit today. ChatGPT Plus, Claude Pro, GitHub Copilot. Tools like Claude Code and Codex are available at this price, but the token limits mean you’re rationing usage across the month rather than leaning into it. Enough to get a taste of AI-assisted development, not enough to make it part of how you actually work.
The power user tier (~$100-$200/month) is where the relationship with AI starts to change. Both Anthropic and OpenAI are pricing for this tier: Claude Max at $100/month and $200/month, Codex Pro at $200/month. Engineers at this level run multiple agent sessions in parallel. They describe what they want at a higher level of abstraction, review the output, and spend less time writing code than directing the agents that write it. Delegation, not autocomplete.
The frontier tier ($1,000+/day) is where things get strange. In February 2026, StrongDM’s AI team published their approach to what they call the “Software Factory”: humans write no code and review no code. Their rule of thumb: “If you haven’t spent at least $1,000 on tokens today per human engineer, your software factory has room for improvement.” Their three-person team builds security software using agent swarms, digital clones of services like Okta and Slack for testing, and thousands of automated scenarios per hour. The humans design the specs and scenarios. The agents do everything else.
Similarly, Anthropic researcher Nicholas Carlini used Claude Code’s agent teams feature, which lets multiple AI instances work in parallel on a shared codebase, to build a C compiler from scratch: 16 parallel agents, $20,000 in API costs over two weeks, 100,000 lines of Rust — though the compiler still generates less efficient code than GCC with optimizations disabled.
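To put these tiers on one scale, here is a back-of-envelope sketch of how many output tokens each budget buys. It assumes a flat price of $14 per million output tokens (the frontier figure cited later in this piece); real bills mix input, output, and cached tokens at different rates, so treat these as order-of-magnitude estimates only.

```python
# Rough token budgets per tier, assuming a flat $14 per million output tokens.
# This ignores input and cached-token pricing, so the numbers are illustrative.

PRICE_PER_MILLION = 14.00  # USD per million output tokens (assumed)

def tokens_per_budget(budget_usd: float) -> float:
    """Millions of output tokens a monthly budget buys at the assumed price."""
    return budget_usd / PRICE_PER_MILLION

tiers = {
    "consumer ($20/month)": 20,
    "power user ($200/month)": 200,
    "frontier ($1,000/day, ~$30,000/month)": 30_000,
}

for name, budget in tiers.items():
    print(f"{name}: ~{tokens_per_budget(budget):,.0f}M output tokens/month")
```

The spread is the point: the frontier tier buys on the order of 150x more tokens per month than the power user tier, and over 1,000x more than the consumer tier.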
Personal take
For real, sustained engineering work, I’ve found you need the $100-$200/month tier. That’s where background agents become viable. You can have Claude/Codex working on one task while you think about the next. Agents that don’t just write code but write comprehensive test suites, look for DRY violations, check for performance issues, flag security concerns. That post-task verification work is where a lot of the value lives, and it eats tokens fast.
Even $200/month has limits. Once you want multiple parallel agents each doing heavy research across a codebase, comprehensive testing, or deep optimization passes, you’re looking at API access and costs that climb well beyond any fixed subscription.
What about price drops?
In a recent town hall, Sam Altman stated that OpenAI expects to deliver “GPT-5.2 level intelligence by the end of 2027 for at least 100x less” than current pricing. That would bring output tokens from roughly $14 per million down to $0.14. On the surface, this sounds like it closes the gap.
In practice, it probably won’t. Karel already notes that his token usage goes up with each new model, not because the models are less efficient, but because more capable models unlock workflows that weren’t previously possible. When GPT-5.2 level intelligence costs 100x less, the people currently spending $1,000/day will be running whatever comes next, at today’s frontier prices or higher. Cheaper tokens don’t reduce consumption; they expand what’s worth doing. This pattern has a name: the Jevons Paradox.
Multi-agent tooling is pushing in the same direction, making it easier to spin up parallel AI instances that each carry a full context window. As friction drops, spending goes up.
The trend points one way: more tokens, not fewer.
What engineering teams should consider
Most companies are still spending $101-$500 per developer per year on AI tools. That works out to roughly $8-$42 per month, firmly in the consumer tier. For real-world heavy usage, it's not enough.
CTOs and heads of engineering should be actively exploring what the $100-$200/month tier unlocks for their teams. Not every engineer needs the same budget. Some teams will benefit from giving senior engineers higher token allowances to experiment with heavier workflows and report back on what produces repeatable, high-quality results. The key word is repeatable. The value of higher spend is only real if it translates into predictable gains. Burning tokens without a methodology is just burning money.
The real edge in AI-assisted engineering might sit beyond even this tier. Boris Cherny, the creator of Claude Code, landed 259 pull requests in a single month with every line written by Claude Code. He’s on API billing, spending roughly $3,000/month. That’s 15x what a $200/month subscription costs, but a fraction of the $1,000/day frontier tier. The output speaks for itself, with strong methodology behind it. This might be closer to where serious engineering teams end up than either the $200/month ceiling or the $30,000/month factory floor. But teams shouldn’t jump straight here. Get maximum value from the $100-$200/month tier first, then ramp up when you know where the extra tokens go and why.
The frontier tier is producing genuinely interesting results, but whether those patterns generalize remains to be seen. What’s less ambiguous is that the consumer tier is already insufficient for serious AI-assisted engineering, and the gap between tiers is widening. Engineering leaders who start exploring the right tier now will be better positioned than those who wait for prices to drop and hope the gap closes on its own.
References
Karel D'Oosterlinck, “I spent $10,000 to automate my research at OpenAI with Codex”
Boris Cherny on Claude Code and landing 259 PRs in a month
StrongDM AI, “Software Factories and the Agentic Moment”
Simon Willison, “How StrongDM’s AI team build serious software without even looking at the code”
Nicholas Carlini, “Building a C compiler with a team of parallel Claudes”
Sam Altman, OpenAI Town Hall on pricing, safety, and GPT-5
Wikipedia, Jevons Paradox