Dylan Patel of SemiAnalysis, in a conversation published by Dwarkesh Patel this week, dropped a claim that reframes the entire compute race: Google sold roughly a million TPUs to Anthropic. DeepMind internally thought the deal was insane — one of the largest chip allocations in cloud history, handed to a direct competitor in the model race.
Google's cloud team pushed it through. At the time, their revenue from external AI customers was minimal. A million-TPU contract looked like a win for Google Cloud's revenue line. Anthropic, meanwhile, was locking in capacity at a moment when most companies were still debating whether frontier AI training would stay GPU-bound or diversify.
What happened next is the part worth stealing
Gemini usage spiked. Google leadership recalibrated on how much internal compute they would need for their own model training and inference. They went to TSMC to request additional fabrication capacity for future TPU generations. TSMC's response: sold out. Every advanced node was already allocated through existing contracts with NVIDIA, Apple, AMD, and others.
Patel's read is blunt: Anthropic understood compute scarcity before Google did — even though Google was the one manufacturing the chips. A procurement timing gap, not a technology gap. And procurement timing gaps tend to be more expensive.
The contract arithmetic for a 30-person IT shop
If you are evaluating Claude versus Gemini for a production workload today, this story changes the vendor-risk calculation in a way that pricing pages do not capture.
Google cannot meaningfully increase its TPU allocation until 2027. That constrains how fast Gemini inference scales, how aggressively Google can cut API prices, and how much spare capacity exists for enterprise SLAs during demand spikes. Google has been buying an energy company, putting deposits on turbines, and locking up land and power, all the downstream infrastructure plays. But the upstream silicon constraint is the binding one, and it will not loosen on a quarterly timeline.
Anthropic, by contrast, runs on a diversified silicon stack: Google TPUs (which it already locked in), AWS Trainium, and standard NVIDIA GPUs. That three-supplier position means Anthropic's price trajectory and capacity scaling are less correlated with any single supplier's allocation schedule.
An IT buyer building a 2026-2027 cost model for API-heavy workflows — document processing, code generation, agent orchestration — should weight capacity risk alongside per-token price. A vendor that is 15% cheaper today but capacity-constrained through 2027 is a different proposition than one with 15% higher unit cost and multi-source supply.
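To make that trade-off concrete, here is a minimal sketch of the comparison. Every number in it is a hypothetical placeholder, not a vendor quote: swap in your own per-token prices, monthly volume, and an honest estimate of how likely a capacity squeeze is and what overflow capacity would cost you.

```python
# Capacity-risk-adjusted cost comparison. All figures are hypothetical
# placeholders -- substitute your own quotes, volumes, and risk estimates.

def expected_monthly_cost(price_per_mtok, monthly_mtok, p_squeeze,
                          overflow_premium, squeezed_share=0.3):
    """Expected monthly spend when a capacity squeeze forces part of the
    workload onto pricier overflow capacity (spot GPUs, a second provider).

    price_per_mtok   -- contracted price per million tokens (USD)
    monthly_mtok     -- expected monthly volume, in millions of tokens
    p_squeeze        -- estimated probability of a squeeze in a given month
    overflow_premium -- cost multiplier on the squeezed share of traffic
    squeezed_share   -- fraction of traffic affected when a squeeze happens
    """
    base = price_per_mtok * monthly_mtok
    penalty = base * squeezed_share * (overflow_premium - 1) * p_squeeze
    return base + penalty

# Vendor A: 15% cheaper per token, single-source, capacity-constrained through 2027.
# Vendor B: higher unit cost, multi-source supply, lower squeeze probability.
a = expected_monthly_cost(price_per_mtok=8.5, monthly_mtok=500,
                          p_squeeze=0.25, overflow_premium=2.0)
b = expected_monthly_cost(price_per_mtok=10.0, monthly_mtok=500,
                          p_squeeze=0.05, overflow_premium=2.0)
print(f"Vendor A expected monthly cost: ${a:,.0f}")   # ~$4,569
print(f"Vendor B expected monthly cost: ${b:,.0f}")   # ~$5,075
```

Under these made-up numbers the cheaper vendor still wins, but the 15% sticker gap shrinks to roughly 10%; raise the overflow premium or the squeeze probability enough and it flips. The value of the exercise is the sensitivity, not the point estimate.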
Since then, Google has been sprinting
Google acquired an energy company, put deposits on power turbines, and locked up land at a pace that resembles a hyperscaler in 2016, not a search company in 2026. The infrastructure build-out signals that Google leadership is no longer underestimating demand. But infrastructure takes years. Power procurement takes years. And the million TPUs they already sold to Anthropic are not coming back.
Patel's framing is that this is a story about timing, not capability. Google builds excellent chips. Google has the capital to out-invest anyone on infrastructure. What Google lacked, in this specific window, was the conviction to hoard its own supply — and a competitor who had that conviction first.
Skip the take, run the test
If your team currently depends on Gemini API for anything latency-sensitive or throughput-dependent, put a capacity test on your Q2 roadmap. Run your peak-load scenario against both Claude and Gemini during a high-traffic window — not a benchmark, an actual production spike. Measure not just response time but error rates and throttling behavior. The million-TPU story suggests Anthropic has headroom that Google may not, and that gap matters more when the invoice arrives during a demand surge than when you are comparing pricing calculators in a spreadsheet.
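A minimal sketch of that harness, in Python. The two call stubs stand in for whatever client code you already run in production, and the concurrency, request count, and PEAK_PROMPT name are placeholders; the point is to record latency percentiles, error rate, and explicit throttling (HTTP 429) for both providers under the same load shape.

```python
import asyncio
import statistics
import time

async def call_claude(prompt: str) -> int:
    """Replace with your real Anthropic client call; return the HTTP status code."""
    raise NotImplementedError

async def call_gemini(prompt: str) -> int:
    """Replace with your real Gemini client call; return the HTTP status code."""
    raise NotImplementedError

async def run_load(call, prompt, concurrency=50, total_requests=1000):
    """Fire total_requests calls with bounded concurrency; report latency,
    error rate, and throttle rate."""
    latencies, errors, throttled = [], 0, 0
    sem = asyncio.Semaphore(concurrency)

    async def one():
        nonlocal errors, throttled
        async with sem:
            start = time.perf_counter()
            try:
                status = await call(prompt)
            except Exception:
                errors += 1
                return
            latencies.append(time.perf_counter() - start)
            if status == 429:      # explicit throttling
                throttled += 1
            elif status >= 400:    # any other failure
                errors += 1

    await asyncio.gather(*(one() for _ in range(total_requests)))
    return {
        "p50_s": statistics.median(latencies) if latencies else None,
        "p95_s": statistics.quantiles(latencies, n=20)[18] if len(latencies) >= 20 else None,
        "error_rate": errors / total_requests,
        "throttle_rate": throttled / total_requests,
    }

# Run the identical peak-load shape against both providers and compare:
# print(asyncio.run(run_load(call_claude, PEAK_PROMPT)))
# print(asyncio.run(run_load(call_gemini, PEAK_PROMPT)))
```

Run it during your real high-traffic window, not a quiet afternoon; throttle_rate under genuine contention is the number the capacity story is about.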
