Jensen Huang: The AI Bottleneck Has Shifted from Chips to Power — Agentic AI Demands 10x More Compute

Industry Insights · May 14, 2026

May 15, 2026 · 3 min read

At the 29th Milken Institute Global Conference, Nvidia CEO Jensen Huang made a case that cuts to the heart of where the AI industry is actually headed. His argument: generative AI was the opening act. The real competition — and the real value creation — happens in the age of Agentic AI.

Huang put a striking number on it: compute demand for Agentic AI is up roughly 1,000%, about a tenfold increase, compared with generative AI two years ago. The reason, he explained, is a fundamental shift in what AI is being asked to do. Generative AI handles a single prompt and returns a response. Agents, by contrast, work through a problem iteratively: planning, reasoning, calling external tools, checking their own work, and correcting course, all within a single task. The compute load compounds at every step.
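To see why the load compounds, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every agent step re-reads the full accumulated context before producing new output. The token counts and the function name are hypothetical and do not come from Huang's remarks.

```python
def agent_tokens_processed(prompt_tokens: int, output_tokens_per_step: int, steps: int) -> int:
    """Total tokens processed when every step re-reads the accumulated context."""
    total = 0
    context = prompt_tokens
    for _ in range(steps):
        # Each step reads the whole context so far, then generates new output.
        total += context + output_tokens_per_step
        # The new output joins the context that the next step has to read.
        context += output_tokens_per_step
    return total


# Illustrative numbers only: a 1,000-token prompt and 500 tokens of output per step.
single_shot = agent_tokens_processed(1_000, 500, steps=1)
agent_run = agent_tokens_processed(1_000, 500, steps=5)
print(single_shot, agent_run, round(agent_run / single_shot, 1))  # 1500 12500 8.3
```

Under these made-up inputs, a five-step agent run processes about eight times the tokens of a single prompt-and-response call, which is the same ballpark as the tenfold figure Huang cited.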

Chips Are No Longer the Constraint

Perhaps the more consequential signal in Huang's remarks was this: the primary bottleneck in AI infrastructure has moved from chips to power. That's a meaningful shift. It means the scarcity is migrating away from silicon and toward the physical infrastructure around it: the grid, the cooling systems, and the real estate that data centers sit on.

This aligns closely with a recent Barclays research note, which found that GPU rack power density has surged from roughly 25 kW per rack in 2020 to 150 kW under the current Blackwell architecture — with projections exceeding 600 kW after Nvidia's Rubin Ultra launch in 2027. At that scale, power supply and thermal management become the long poles in the tent for any data center buildout.
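To make the constraint concrete, the short Python sketch below works out how many racks a fixed power budget can feed at each of the densities Barclays cites. The 100 MW facility size and the function name are illustrative assumptions, and the calculation ignores cooling and other overhead.

```python
def racks_supported(facility_mw: float, rack_kw: float) -> int:
    """Whole racks a facility can power, ignoring cooling and other overhead."""
    return int(facility_mw * 1000 // rack_kw)


# Rack densities from the Barclays note, in kW per rack.
densities_kw = {"2020": 25, "Blackwell": 150, "after Rubin Ultra (projected)": 600}

FACILITY_MW = 100  # hypothetical facility size, chosen only for illustration

for era, kw in densities_kw.items():
    print(f"{era}: {racks_supported(FACILITY_MW, kw)} racks at {kw} kW each")
```

Under these assumptions, the same 100 MW feeds 4,000 racks at 2020 densities, 666 at Blackwell densities, and 166 at the projected 600 kW. The chips get denser, but the power envelope decides how many of them can be switched on.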

Huang Isn't Talking About the Future. He's Describing the Present.

One detail in Huang's remarks is easy to miss but worth sitting with: he didn't say Agentic AI will arrive. He said it is happening. That's not marketing language — it's a description of current demand.

The most concrete validation is Nvidia's own order book. Enterprises don't commit to large-scale compute purchases on speculation. Earlier this month, Anthropic signed a deal with SpaceX to source more than 300 megawatts of capacity from SpaceX's Memphis data center, the equivalent of roughly 300,000 H100 GPUs. That infrastructure isn't being provisioned for chatbots. It's being built to run Claude at enterprise scale, across complex, multi-step Agent workflows.
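As a quick sanity check on that equivalence, the sketch below divides the cited capacity by the cited GPU count. The function name is hypothetical, and the result is an all-in facility figure rather than a chip specification.

```python
CAPACITY_MW = 300         # capacity cited above
GPU_EQUIVALENT = 300_000  # H100-equivalent count cited above


def watts_per_gpu(capacity_mw: float, gpu_count: int) -> float:
    """All-in facility power per GPU, in watts."""
    return capacity_mw * 1_000_000 / gpu_count


print(f"{watts_per_gpu(CAPACITY_MW, GPU_EQUIVALENT):.0f} W per GPU")  # 1000 W per GPU
```

That works out to about 1 kW per GPU. An H100 SXM module is rated at up to 700 W on its own, so the figure plausibly leaves room for host CPUs, networking, and cooling, though the article's sources do not break the number down.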

"The demand for agentic compute is expanding AI's reach from the software industry into the $50 trillion physical economy — and this isn't a future event. It's happening now."
— Jensen Huang, Milken Institute Global Conference, May 2026

The 1,000% compute figure isn't just a headline. It has real implications for any business evaluating an AI Agent strategy. Running an Agent can cost roughly an order of magnitude more than a standard single-prompt API call: what used to cost a fraction of a cent per query can now run ten times that, depending on task complexity and the number of reasoning steps involved.

That raises a question the industry hasn't spent nearly enough time on: in which use cases does deploying an AI Agent actually pencil out? Can the productivity gains (fewer staff hours, faster cycle times, reduced errors) justify the step-change in compute costs?
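One way to frame that question is a simple break-even calculation, sketched below in Python. The function names and every input value are hypothetical placeholders. The model counts only labor time saved and ignores error reduction, cycle-time effects, and integration costs.

```python
def monthly_net_value(tasks_per_month: int, minutes_saved_per_task: float,
                      hourly_labor_cost: float, compute_cost_per_task: float) -> float:
    """Labor savings minus agent compute cost for one month, in dollars."""
    labor_savings = tasks_per_month * minutes_saved_per_task * hourly_labor_cost / 60
    compute_cost = tasks_per_month * compute_cost_per_task
    return labor_savings - compute_cost


def break_even_minutes(hourly_labor_cost: float, compute_cost_per_task: float) -> float:
    """Minutes of labor an agent must save per task to cover its compute cost."""
    return compute_cost_per_task * 60 / hourly_labor_cost


# Placeholder inputs: 10,000 tasks a month, 2 minutes saved per task,
# $60 per hour of labor, $0.50 of compute per agent task.
print(f"net value: ${monthly_net_value(10_000, 2, 60, 0.50):,.2f}")
print(f"break-even: {break_even_minutes(60, 0.50):.2f} minutes saved per task")
```

With these placeholder inputs the agent clears its compute bill easily, since it only needs to save half a minute per task. The picture changes quickly for low-value, high-volume queries, where per-task compute can exceed the labor it replaces.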

Goldman Sachs, in its recently published Decoding the Agent Economy report, projects that global AI token demand will grow 24x by 2030. Taken together with Huang's 1,000% figure, this forecast points to something more significant than a growth story. It signals a fundamental restructuring of cost economics across industries. The businesses that figure out the ROI math early, and build their Agent strategies around it, will be the ones with a durable edge when this market matures.
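For a sense of scale, the sketch below converts a 24x increase into an implied compound annual growth rate. The base year of the Goldman Sachs projection is not given here, so the four- and five-year horizons are assumptions, and the function name is hypothetical.

```python
def implied_annual_growth(multiple: float, years: int) -> float:
    """Constant yearly growth rate that compounds to `multiple` over `years`."""
    return multiple ** (1 / years) - 1


# Assumed horizons only: the projection's base year is not stated in this article.
for years in (4, 5):
    rate = implied_annual_growth(24, years)
    print(f"{years} years: {rate:.0%} per year")
```

On either assumed horizon, the projection works out to token demand roughly doubling every year.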

Sources: Milken Institute Global Conference / Barclays Research / Goldman Sachs Research
