Engineering · 2026-05-11 · 8 min read

Token economics: what 15 billion Claude tokens actually cost, and what they bought


John C. Thomas

Founder, BlueWave Projects

Over one focused sprint of solo SaaS building with Claude Code I processed approximately 15 billion Claude tokens across the home Apple server — three Macs running Claude Code in rotation as one logical development environment. That number does most of its work as a headline. It's also worth pulling apart, because the breakdown tells you a lot about how agent-augmented engineering actually works — and whether it's worth the spend if you're considering the same workflow.

The breakdown

From the logged sessions across the home Apple server:

| Bucket | Tokens | Share |
|---|---|---|
| Cache reads | 11.6B | 77% |
| Cache creation | 378M | 2.5% |
| Output (model-generated) | 38M | 0.25% |
| Direct input (uncached) | 158K | <0.001% |

These totals are aggregated across all three machines. 15B is the conservative rolling headline; the real total floats between 15 and 19 billion depending on the active sprint week.

What each bucket actually is

Cache reads (11.6B). This is the agent replaying recent context — the files in the active session, the running todo list, recent tool results. Every time Claude Code makes a tool call (read a file, run a grep, propose an edit) it re-reads the conversational state to stay coherent. Most of those tokens are the *same tokens* served back, billed at the Anthropic cache-read rate.

Cache creation (378M). New context being added to the cache as the session grows. Each new file Read, each new search result, each new system reminder bumps this up.

Output (38M). The actual model-generated tokens: code suggestions, edits, plans, reasoning, summaries. This is the bucket where "Claude wrote the code" literally lives.

Direct input (158K). Tokens that flowed in *without* a cache hit. Almost negligible — under one Pride and Prejudice's worth of text across the whole sprint of nonstop work. This is what most people imagine when they hear "input tokens." It's actually a rounding error.
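The cache-read dominance described above falls out of the agent loop's shape: every tool call replays the whole accumulated context as a cache read, while only the new context is billed as a cache write. A toy model (my numbers, not the post's logs — `simulateSession` and the per-call figures are illustrative assumptions) shows how lopsided that gets:

```typescript
// Toy model: why cache reads dwarf cache creation in an agent session.
// Each tool call replays everything cached so far (a cache read) and
// appends only the new context (a cache write). Numbers are illustrative.

interface SessionTotals {
  cacheReads: number;
  cacheCreation: number;
}

function simulateSession(toolCalls: number, newTokensPerCall: number): SessionTotals {
  let contextSize = 0;
  let cacheReads = 0;
  let cacheCreation = 0;
  for (let i = 0; i < toolCalls; i++) {
    cacheReads += contextSize;         // replay the full cached context
    cacheCreation += newTokensPerCall; // only the delta is written to cache
    contextSize += newTokensPerCall;   // context grows for the next call
  }
  return { cacheReads, cacheCreation };
}

// 200 tool calls, each adding ~2K tokens of new context:
const totals = simulateSession(200, 2_000);
console.log(totals.cacheReads / totals.cacheCreation); // reads outnumber writes ~100:1
```

The read volume grows with the *square* of session length while creation grows linearly, which is why long sessions skew the read bucket so hard.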

Rough cost math

Anthropic's current API pricing for Claude Sonnet:

  • Input: $3 / Mtok
  • Cache writes: $3.75 / Mtok (1.25× input)
  • Cache reads: $0.30 / Mtok (10× cheaper than fresh input)
  • Output: $15 / Mtok

Plug in the numbers above:

  • 11.6B cache reads × $0.30/Mtok = $3,480
  • 378M cache writes × $3.75/Mtok = $1,418
  • 38M output × $15/Mtok = $570
  • 158K direct input × $3/Mtok ≈ $0.47
  • Total: roughly $5,470 across the home Apple server over the sprint — approaching $7,000 at the 19B upper bound.

I'm a Max plan subscriber — most of this is covered by a flat monthly fee. But the underlying compute economics are still worth understanding because they tell you what's possible.

What that bought

Concrete deliverables shipped in the same sprint:

  • BlueWave Projects — multi-tenant SaaS, 15+ tables, full auth, signup, billing scaffolding, ~9 ops tools
  • Ikena Portal — production tenant running a real $139K renovation end-to-end
  • Hawaii 3D Map / hawaii-as-code — 384K parcels, 239K buildings, 204K addresses, encoded as TypeScript and rendered as a 3D map
  • Property Brief — subscription product, signup flow, weekly cron, transactional email
  • Aloha Off-Market Network — three-tier subscription product, signup
  • Hawaii Property Lookup — 4-island address autocomplete + parcel card
  • ProBuildCalc iOS — multi-room stitching, photo evidence pinning, time-lapse compare, AI design overlay
  • Marketing site — multi-page Next.js with portfolio + case studies + résumé surfaces
  • Captain résumé page + Hire page — two structured résumés
  • The relaunch of ProtestTracker
  • Roughly 600K lines of code across 10+ active git repos, 580+ commits.

The conventional alternative — building this with a small engineering team — costs in the neighborhood of $200K per senior engineer per year fully loaded. Even at four engineers running flat-out for two months it's $130K in salary alone, not counting management overhead, hiring time, onboarding, and the inevitable team-shape friction. Tokens are not a tax. They're the cheapest part of the stack by an order of magnitude.

The operator lesson

The hiring conversations I'd want to have with someone evaluating this résumé go through one specific door: I have a *visceral* sense of what production AI costs and what it doesn't. I've seen the bill. I've watched the cache-hit rate drop by 12 points overnight when a system prompt was edited wrong. I've reduced p99 by 4 seconds by moving one section above the cache-control marker.

That's the difference between an engineer who *uses* AI and an engineer who *runs* AI in production. The latter has done the cost-modeling, owns the latency, and knows when to spend $0.05 on a Claude call versus when a deterministic regex would do the job for $0.

If you're shipping AI features at any scale, that intuition isn't optional. It's also one of the few things you can't fake on a résumé.
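The regex-versus-model instinct is a routing decision, and it's worth seeing in code. A hedged sketch — `callClaude` is a hypothetical wrapper around a model client, not a real API, and the task (normalizing a date field) is just an example:

```typescript
// Deterministic-first routing: try the $0 path, spend the model call only
// when the input is genuinely messy. `callClaude` is a stand-in for your
// own model client wrapper.

const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

async function extractDate(
  field: string,
  callClaude: (prompt: string) => Promise<string>
): Promise<string | null> {
  const trimmed = field.trim();
  // Deterministic path: free, fast, and exactly right when the format matches.
  if (ISO_DATE.test(trimmed)) return trimmed;
  // Fallback: the ~$0.05 model call handles the long tail of messy input.
  const answer = await callClaude(`Normalize this date to YYYY-MM-DD: ${field}`);
  // Validate the model's answer too — never trust generated output blindly.
  return ISO_DATE.test(answer.trim()) ? answer.trim() : null;
}
```

The point isn't the date parsing; it's that the cheap path is tried first and the model's output is validated with the same deterministic check.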

What 15B tokens doesn't tell you

A few honest disclaimers:

  • Not every token is doing useful work. Cache reads are 77% of the total. They're cheap, but they're also where the wastage lives. A poorly-structured prompt repeats 50K tokens of irrelevant context on every tool call.
  • More tokens ≠ better outcomes. Some of my most productive sessions were 200K-token, four-hour focused builds. Some of my least productive were two-hour wandering sessions that burned 800K tokens and shipped one bug fix.
  • The platform does some of the work. Claude Code's own context management is doing a lot for me. The token volume isn't all my optimization — much of it is Anthropic's tooling.
  • But the headline survives. 15 billion tokens is what a high-tempo solo SaaS sprint looks like in 2026. Anyone hiring AI engineers in a regulated vertical should be calibrated against that number — and asking candidates to show theirs.
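The wastage claim in the first bullet is easy to put a dollar figure on. Using the cache-read rate from the pricing section (the per-session call count is my assumption, not a number from the post):

```typescript
// Cost of repeating 50K tokens of irrelevant context on every tool call,
// at the $0.30/Mtok cache-read rate quoted above.
const CACHE_READ_PER_MTOK = 0.3;
const wastedTokensPerCall = 50_000;
const wastedPerCallUSD = (wastedTokensPerCall * CACHE_READ_PER_MTOK) / 1e6; // ≈ $0.015

// Assumed session shape: ~500 tool calls in a long agent session (my guess).
const wastedPerSessionUSD = wastedPerCallUSD * 500; // ≈ $7.50 per session
```

Cheap per call, but it compounds: across hundreds of sessions, badly structured context is where real money quietly leaks.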
