Disclaimer: The opinions expressed here are solely my own and not those of any employer, client, or affiliated organisation.

AI tokens and the ‘tokenpocalypse’?

The ‘tokenpocalypse’ isn’t about scary AI bills so much as it is about the absence of sensible AI governance.

Image: poker chips. Photo by Amanda Jones / Unsplash

What is a token anyway?

When people start waving around “tokenpocalypse” headlines, it’s worth pausing to ask what a token actually is. In most language models, a token is just a tiny chunk of text - often around three‑quarters of a word - that the model uses as its basic unit for reading and writing. Your prompt is chopped up into tokens, the model’s response comes back as tokens, and providers quietly track and bill you on how many tokens you send and receive. For many of us, this is largely invisible because we’re sitting on “all‑you‑can‑eat” licences for everyday tools like Microsoft 365 Copilot, where token usage is abstracted behind a flat seat price. But the moment you step into AI systems that aren’t all‑you‑can‑eat, tokens stop being an obscure technical detail and become both the alphabet of AI systems and, increasingly, the currency of AI infrastructure: if you’re not tracking how many tokens your organisation is burning, you’re flying blind on both capability and cost.
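To make the billing arithmetic concrete, here is a minimal sketch in Python. It uses the three‑quarters‑of‑a‑word rule of thumb from above to estimate tokens; real tokenisers vary by model and language, so treat the count as a ballpark only. The function names and the per‑million‑token prices are hypothetical, chosen purely for illustration.

```python
import math

# Rule of thumb from the text: one token is roughly three-quarters of a word.
# Real tokenisers differ by model and language; this is only an estimate.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Rough token estimate based on a whitespace word count."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Cost of one request, given prices per million tokens."""
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000


prompt = "Summarise the attached quarterly report in three bullet points"
prompt_tokens = estimate_tokens(prompt)  # 9 words -> about 12 tokens

# Hypothetical prices: 2.00 per million input tokens, 8.00 per million output.
cost = estimate_cost(prompt_tokens, 200, 2.0, 8.0)
print(prompt_tokens, round(cost, 6))  # 12 0.001624
```

A fraction of a cent per request looks harmless, which is exactly why the totals surprise people: multiply it by thousands of users and automated workflows running all day and it becomes a line item.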

And what is a ‘tokenpocalypse’?

Most ‘tokenpocalypse’ scare stories are really symptoms of weak AI governance: organisations have rushed into generative AI without the basic cost, risk, and accountability controls they’d apply to any other strategic technology. To be honest, this is just business 101, nothing new! The remedy isn’t to stop using AI, but to deliberately design a new organisational structure around AI - one that treats tokens, models, and agents as governed resources, not an all‑you‑can‑eat buffet.

The token scare is a governance failure

Over the past year, uncontrolled token use has produced some spectacular bill shocks, with reports of enterprises facing AI invoices in the hundreds of millions of dollars and smaller teams burning through seven‑figure budgets in months. Tools that were sold as productivity boosts - chat assistants, coding copilots, agentic workflows - are now under scrutiny because unit prices have fallen but total consumption has exploded, creating a Jevons paradox for AI: cheaper tokens drive usage up faster than unit prices come down, so total spend rises.
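A quick worked example shows how falling prices and rising bills can coexist. The figures below are invented for illustration and do not describe any real organisation: the unit price halves while token consumption grows fivefold.

```python
def total_spend(tokens: int, price_per_million: float) -> float:
    """Total spend for a period, given tokens used and price per million."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical figures: price per million tokens halves, usage grows 5x.
last_year = total_spend(2_000_000_000, 10.0)
this_year = total_spend(10_000_000_000, 5.0)

print(last_year, this_year, this_year / last_year)  # 20000.0 50000.0 2.5
```

Tokens are half the price, yet the bill is two and a half times larger. Spend only falls when usage grows more slowly than the price drops.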

Tokenmaxxing has become a kind of corporate sport, where success is measured in how much AI is being used rather than what value it delivers, and budgets have been treated more like marketing experiments than disciplined investments. In this context, of course AI looks expensive: there is little linkage from tokens to tasks to outcomes, no clear accountability for overruns, and almost no systematic way to shut down low‑value workloads. If you want to start thinking about cost control, it is worth reading material such as the AI Cost Control Framework for 2026.

What a sensible AI governance framework looks like

A credible AI governance framework starts from the assumption that AI is both a risk and an investment: it needs guardrails for harm and discipline for cost. At minimum, that framework should cover four pillars:

  • Policy and ethics: clear rules about acceptable use, data protection, safety thresholds, and high‑risk applications that require special scrutiny.
  • Architecture and platform: centralised AI platforms or gateways that route traffic, enforce access controls, and give you observability over which models and agents are doing what.
  • FinOps and cost controls: automated token budgets, rate limits, cheaper‑model defaults for simple tasks, and response caching for common queries so you pay once for repeated answers.
  • Value and lifecycle: stage‑gate approvals for new AI initiatives, ROI targets, kill criteria for underperforming systems, and ongoing performance and risk reviews.
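Two of the FinOps controls above, cheaper‑model defaults and response caching, are easy to sketch. The Python below is a minimal illustration, not a production gateway: the class name, the model names "small-model" and "frontier-model", and the word‑count threshold are all hypothetical, and the actual model call is passed in as a function so that no particular provider's API is assumed.

```python
import hashlib
from typing import Callable, Dict


class CachingGateway:
    """Toy gateway: picks a default model and caches repeated questions."""

    def __init__(self, call_model: Callable[[str, str], str],
                 default_model: str = "small-model",       # hypothetical name
                 complex_model: str = "frontier-model",    # hypothetical name
                 complex_word_threshold: int = 200) -> None:
        self._call_model = call_model
        self._default_model = default_model
        self._complex_model = complex_model
        self._threshold = complex_word_threshold
        self._cache: Dict[str, str] = {}
        self.cache_hits = 0

    def _pick_model(self, prompt: str) -> str:
        # Crude proxy for complexity: long prompts go to the larger model.
        if len(prompt.split()) > self._threshold:
            return self._complex_model
        return self._default_model

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Normalise case and whitespace so trivially different prompts match.
        normalised = " ".join(prompt.lower().split())
        return hashlib.sha256(f"{model}\n{normalised}".encode("utf-8")).hexdigest()

    def ask(self, prompt: str) -> str:
        model = self._pick_model(prompt)
        key = self._key(model, prompt)
        if key in self._cache:
            self.cache_hits += 1          # answered without spending tokens
            return self._cache[key]
        answer = self._call_model(model, prompt)
        self._cache[key] = answer
        return answer


# Usage with a stand-in for a real provider call.
calls = []

def fake_model(model: str, prompt: str) -> str:
    calls.append(model)
    return f"[{model}] answer"

gateway = CachingGateway(fake_model)
first = gateway.ask("What is our leave policy?")
second = gateway.ask("what is our   leave policy?")
print(len(calls), gateway.cache_hits)  # 1 1
```

A real gateway would also need cache expiry, per‑user access checks before serving a cached answer, and a better complexity signal than prompt length, but the principle is the same: pay once for a common question and default to the cheaper model.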

The shift from “just give everyone a copilot licence” to a governed, platform‑centric model is already underway, with many organisations building internal AI platforms that sit between business users and external models to enforce these controls.

Cost control is a feature, not a constraint

Effective AI cost governance doesn’t mean starving experimentation; it means separating play from production and making each intentional. Teams can be given sandboxes and experimentation budgets with looser limits, while production workflows - things that run every day and touch customers or core processes - get stricter budgets, model choices, and escalation paths for overruns.

Modern guidance emphasises a few practical mechanisms:

  • Hard budgets, soft alerts: automated alerts at 70-90% of budget, with circuit‑breakers when use deviates too far from forecasts.
  • Right‑sizing models: defaulting to smaller, cheaper models for routine tasks, reserving frontier models for genuinely complex work.
  • Governance at the seat and workflow level: limiting licences to verified use cases, tracking token usage by team, and reallocating costs based on actual consumption.
  • Kill criteria: agreeing upfront on thresholds where a project is paused or shut down if value fails to materialise relative to spend.
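Several of these mechanisms can be sketched in a few lines of Python: a hard token budget, soft alerts at 70% and 90%, usage tracked by team, and costs reallocated in proportion to consumption. Everything here is a hypothetical illustration (class name, thresholds, team names, and figures included). The hard limit stands in for a circuit‑breaker; comparing usage against a forecast is not modelled.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


class BudgetExceeded(Exception):
    """Raised when a request would push usage past the hard limit."""


@dataclass
class TokenBudget:
    limit_tokens: int
    alert_thresholds: Tuple[float, ...] = (0.7, 0.9)   # soft alert points
    used_by_team: Dict[str, int] = field(default_factory=lambda: defaultdict(int))
    alerts_sent: List[float] = field(default_factory=list)

    @property
    def used(self) -> int:
        return sum(self.used_by_team.values())

    def record(self, team: str, tokens: int) -> None:
        # Hard budget: refuse the request rather than overspend.
        if self.used + tokens > self.limit_tokens:
            raise BudgetExceeded(f"{team} request of {tokens} tokens blocked")
        self.used_by_team[team] += tokens
        # Soft alerts: note each threshold once, the first time it is crossed.
        fraction = self.used / self.limit_tokens
        for threshold in self.alert_thresholds:
            if fraction >= threshold and threshold not in self.alerts_sent:
                self.alerts_sent.append(threshold)

    def chargeback(self, total_cost: float) -> Dict[str, float]:
        """Split a bill across teams in proportion to tokens consumed."""
        used = self.used
        if used == 0:
            return {}
        return {team: total_cost * tokens / used
                for team, tokens in self.used_by_team.items()}


budget = TokenBudget(limit_tokens=1000)
budget.record("support", 600)       # 60% used, no alert yet
budget.record("engineering", 150)   # 75% used, crosses the 70% alert
budget.record("support", 200)       # 95% used, crosses the 90% alert

try:
    budget.record("engineering", 100)   # would reach 105% of budget
    blocked = False
except BudgetExceeded:
    blocked = True

print(budget.alerts_sent, blocked)  # [0.7, 0.9] True
print(budget.chargeback(95.0))      # {'support': 80.0, 'engineering': 15.0}
```

In practice the alert would notify a budget owner rather than append to a list, and a blocked request would trigger the escalation path agreed for that workflow.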

When these controls are in place, the ‘tokenpocalypse’ looks less like an inevitability and more like an avoidable governance gap - the kind of problem boards and CFOs tackle all the time in other domains.

We’re building a new organisational structure

The deeper story here is organisational: AI forces institutions to re‑architect how authority, responsibility, and resources flow. You cannot bolt AI onto a traditional hierarchy and hope for the best; once models and agents start making decisions, generating content, and interacting with customers, you are effectively adding semi‑autonomous actors into the structure of the organisation.

A robust AI governance framework is therefore not just a compliance checklist - it is the blueprint for a new kind of organisation where:

  • Cross‑functional AI governance boards decide on high‑risk use cases and ethical guardrails.
  • Platform engineering teams own the AI backbone: routing, monitoring, and enforcing cost and risk policies.
  • Business units become accountable for AI value, not just usage, with explicit ROI and risk thresholds.
  • CFOs and boards treat AI spend as a portfolio: diversified, stage‑gated, and subject to maximum exposure limits.

Seen this way, the token scare is an early stress test for that emerging organisational form. Organisations that respond by tightening governance, clarifying accountability, and centralising control will turn AI into a disciplined infrastructure; those that keep chasing usage metrics without structure will stay trapped in the token panic cycle.

© 2002-2026 Kate Carruthers