The hyperscaler capex cycle has crossed $600bn — here is the under-discussed line item that decides whether it pays back
The four hyperscaler 10-Qs for Q1 calendar 2026 are now all in. Stack them and the annualised run-rate of AI-related capex across Microsoft, Alphabet, Amazon, and Meta clears $612bn for the year, up from $295bn in 2024: a 107% two-year increase, or roughly the entirety of the global semiconductor capex cycle moved through four corporate cash-flow statements. The number is so large it has stopped functioning as a number in most analyst write-ups; it functions as a vibe, against which one is either bull or bear on AI.
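The growth figure can be checked directly from the two totals quoted above. A minimal Python sketch, assuming the 2024 and 2026 figures are like-for-like annual totals; the variable names and the compound annual rate are illustrative additions, not something taken from the filings.

```python
# Totals quoted in the text, in $bn. Treating them as comparable
# annual figures is an assumption of this sketch.
capex_2024 = 295.0
capex_2026 = 612.0
years = 2

# Cumulative growth over the two years.
total_growth = capex_2026 / capex_2024 - 1

# Compound annual rate implied by that cumulative growth.
cagr = (capex_2026 / capex_2024) ** (1 / years) - 1

print(f"two-year growth: {total_growth:.0%}")
print(f"implied annual rate: {cagr:.0%}")
```

The cumulative figure rounds to the 107% quoted above; the annualised rate is simply another way of reading the same two numbers.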
I would like to argue the framing should be different. The question that decides whether this capex pays back is not how big it is. It is what fraction of it is being used. Utilisation, specifically inference utilisation, is the one variable that converts the capex into revenue, and it is the variable the buy-side spends the least time modelling.
Here is what we can actually see in the disclosed numbers.
What we mean by utilisation
The simplest definition: of the FLOPs of AI inference capacity that a hyperscaler has installed at any given moment, what fraction is delivering billable tokens? A B200 sitting idle is an expensive paperweight. A B200 running at 92% duty cycle on Bedrock or Vertex inference is, at current pricing, a 40-something percent gross-margin asset. The gap is the whole investment thesis.
No hyperscaler discloses this directly. Three of the four publish heavily-massaged proxies — Azure Cognitive Services revenue per dollar of AI capex deployed, Bedrock token-throughput growth, TPU-hours-billed. Cross-referencing those proxies with deployed-capacity figures from the SemiAnalysis tracker and the TrendForce Q1 2026 hyperscaler GPU census lets us back into an estimate.
Figure 1 — Author's estimate, cross-referenced
Estimated AI inference utilisation (% of installed FLOPs delivering billable workload), Q2-2024 to Q1-2026
Source: Author's model, cross-referenced against SemiAnalysis hyperscaler deployment tracker (Apr 2026), TrendForce GPU census Q1 2026, and disclosed revenue-per-FLOP proxies from MSFT, GOOGL, AMZN, META 10-Qs Q1 calendar 2026. Capacity bars indexed to Q2-2024 = 100.
The shape on the chart is the story. Installed inference capacity (the bars) has grown roughly 5x in eight quarters. Estimated utilisation (the line) has fallen from ~78% to ~61% over the same period and, critically, Q1 2026 is the first quarter in which it flattened rather than dropped further.
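One way to read the two series together is to multiply them: installed capacity times utilisation gives the capacity actually delivering billable work. A rough Python sketch using the rounded figures from the paragraph above (roughly 5x, ~78%, ~61%); the variable names are illustrative, and the outputs inherit the imprecision of those estimates.

```python
capacity_growth = 5.0   # installed capacity multiple, Q2-2024 to Q1-2026 (approximate)
util_start = 0.78       # estimated utilisation, Q2-2024 (approximate)
util_end = 0.61         # estimated utilisation, Q1-2026 (approximate)

# Utilised capacity = installed capacity x utilisation, so the multiple on
# billable workload is the capacity multiple scaled by the utilisation ratio.
utilised_growth = capacity_growth * util_end / util_start

# Idle capacity = installed capacity x (1 - utilisation), indexed the same way.
idle_growth = capacity_growth * (1 - util_end) / (1 - util_start)

print(f"billable workload multiple: {utilised_growth:.1f}x")
print(f"idle capacity multiple: {idle_growth:.1f}x")
```

Under these assumptions billable workload has grown by a bit under 4x while idle capacity has grown by nearly 9x, which is the gap the utilisation line is summarising.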
That flattening matters. It is either the early sign that demand is catching up with the build-out, in which case the capex cycle is rationally sized and the second derivative is positive, or it is a temporary pause before the curve resumes downward, in which case the 2026 capex guides are overshooting demand by 15–20% and the 2027 free-cash-flow profile of the hyperscalers is going to look materially worse than consensus expects.
Three reasons for the flattening
Pull apart the Q1 2026 quarter and three things are happening at once.
The first is Microsoft Copilot's enterprise inflection. Office 365 Copilot revenue, embedded in the M365 commercial line, is growing faster than the disclosed Azure AI line. The relevant Q1 disclosure: M365 commercial cloud revenue grew 16% YoY, of which roughly 380bps were Copilot-attributable per the earnings call transcript. Copilot's underlying inference is consumed against deployed Azure capacity. As Copilot grows, utilisation on Microsoft's own GPUs grows mechanically.
The second is Bedrock and Vertex catching API demand. Anthropic moved a meaningful chunk of its inference volume onto Bedrock through 2025; Google's Gemini 2.0 Flash and 2.5 Flash dramatically expanded the API-callable surface. AWS disclosed Bedrock-token throughput grew 4.7x YoY in Q1; Google disclosed Vertex AI billable tokens up "more than 5x." Some fraction of that is genuine demand growth and some is share-shift from OpenAI's direct API, but either way it lands on hyperscaler capacity.
The third, and the one to watch, is OpenAI's own deployment slowing. OpenAI's compute spend through Microsoft is now running at roughly $19bn/year, capped by Microsoft's capacity-allocation contract. The faster-than-expected build-out of Stargate (Oracle-hosted, SoftBank-financed) is supposed to relieve that constraint by Q3, but the Q1 print suggests OpenAI is being throttled at the margin. The throttle flattens Microsoft's utilisation curve in the short run, presumably because the capped allocation stays fully loaded for as long as the constraint binds.
Net of the three: the flattening is partly demand catching up (durable) and partly a constraint on the largest single inference consumer (temporary). Honest read: the curve probably resumes downward by one or two percentage points through Q2 and Q3 before stabilising in the high 50s through year-end.
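The projection above can be sanity-checked with simple subtraction from the ~61% Q1 2026 estimate. The sketch below is illustrative only: whether "one or two percentage points through Q2 and Q3" means a cumulative drop or a drop in each quarter is not specified, so both readings are computed, and the names are hypothetical.

```python
util_q1_2026 = 0.61   # estimated utilisation at Q1 2026 (approximate)
quarters = 2          # Q2 and Q3 2026

# Map each assumed drop (in percentage points) to the Q3 level under
# two readings: the drop is cumulative, or it applies in each quarter.
results = {}
for drop_pp in (1, 2):
    cumulative = util_q1_2026 - drop_pp / 100
    per_quarter = util_q1_2026 - quarters * drop_pp / 100
    results[drop_pp] = (cumulative, per_quarter)
    print(f"{drop_pp}pp drop: cumulative {cumulative:.0%}, per-quarter {per_quarter:.0%}")
```

Either reading leaves utilisation at or just below 60%, with the per-quarter reading landing most clearly in the high 50s.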
What it implies for the 2027 capex guides
If utilisation stabilises in the high 50s, the 2027 capex guides need to roll over. The arithmetic is straightforward: at 60% utilisation, every incremental dollar of capex needs to generate $1.85 of revenue to clear a 22% pre-tax IRR (assuming 4-year depreciation, current pricing). At 75% utilisation, the same dollar needs to generate $1.48. The implied revenue ramp at current capex run-rate, at 60% utilisation, is roughly $370bn of new AI revenue by FY28 across the four. That number is plausible; it is not certain.
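The two hurdle figures above are consistent with required revenue scaling inversely with utilisation: $1.85 at 60% and $1.48 at 75% both imply the same constant of about 1.11. The Python sketch below backs that constant out of the 60% data point and applies it to other utilisation levels. The inverse-scaling form is an inference from the two quoted figures, not a reconstruction of the underlying IRR model (22% pre-tax, 4-year depreciation), and the function and constant names are hypothetical.

```python
# Constant implied by the 60% data point quoted in the text.
# Assumption: required revenue per capex dollar = k / utilisation.
K_IMPLIED = 1.85 * 0.60


def required_revenue_per_capex_dollar(utilisation: float, k: float = K_IMPLIED) -> float:
    """Revenue each capex dollar must generate at a given utilisation."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return k / utilisation


for u in (0.58, 0.60, 0.75):
    hurdle = required_revenue_per_capex_dollar(u)
    print(f"{u:.0%} utilisation -> ${hurdle:.2f} per capex dollar")
```

At 75% the sketch reproduces the $1.48 figure, and at a high-50s level such as 58% the hurdle moves above $1.85, which is why the stabilisation level matters so much for the 2027 guides.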
The under-discussed risk is that the four hyperscalers are individually capacity-constrained ("we cannot sell what we cannot deploy") while being collectively at-or-near capacity-saturated ("the marginal token is now competitively priced because everyone has capacity"). Those two conditions can be true at the same time, and 2026 is the first year where they are both visibly true.
Watch for the language in the Q2 2026 calls. If three of the four hyperscalers shift from "demand exceeds supply" to "we are seeing strong demand at attractive unit economics," the cycle has entered a new phase. That phrasing is the cycle-top tell.
The Concept-Index point
This is a Capex Arms Race result, with an Aggregator Theory overlay. The capex itself is rationalisable only by reference to the customer relationship it locks in — not by reference to the spot economics of any individual GPU. Microsoft is not building $80bn of capacity because B200s clear a great IRR in isolation; it is building $80bn of capacity to make sure that, for the next decade, every enterprise inference workload runs on Azure. The capex is a marker for the aggregator position.
That makes the cycle inherently harder to bend than the bears claim. A rational individual hyperscaler may cut capex; four rational hyperscalers playing a positional game may not, even when the spot economics worsen. The bear case requires one of them to defect first. So far, none has.
— Kairos Thorne, Singapore. 12 May 2026.
