The experimental phase of enterprise AI is over. Instead of treating generative models as a free add‑on for legacy systems, IT leaders are staring down the operational reality of AI refresh cycles shrinking from yearly to monthly. The faster cadence and heavier workloads are forcing organizations to rethink where they run those workloads and how much they are willing to pay for the privilege. The math seemingly comes down to two expensive options: buying hardware for internal operations or paying the toll to hyperscalers. Complicating that debate are data sovereignty concerns, worries about ROI, and hybrid cloud strategies, all of which are pushing executive teams toward a third path: leveraging in‑platform AI within ERP, CRM, and content management systems to avoid running their own training stacks.
Jaspal Dhalliwal is a principal solutions architect in enterprise AI and cloud-based content management, with a background in large-scale IT transformation across industries including healthcare, financial services, and manufacturing. He has held senior architecture and advisory roles at Google, Microsoft, Oracle, and Dell Technologies, where he worked on hyperscaler infrastructure, data platforms, and enterprise cloud strategy at global scale. Having seen the ecosystem from both the hyperscaler core and the enterprise edge, Dhalliwal notes that much of today’s infrastructure strain stems from a market that adopted these models faster than the surrounding tooling and hardware could mature. “AI was launched before the ecosystem was ready, and now the bill for that gap is coming due,” he says. “The shine of mass-market AI is fading, and what’s left are the real challenges around economics, scaling, sovereignty, and data governance.”
Leaving the station: For Dhalliwal, the per-token cost numbers enterprises are now seeing are less of a surprise than a delayed invoice for a technology that went to market ahead of its supporting stack. “I think that the AI train, if you wish, was inadvertently launched a little bit before the entire ecosystem was ready. And we’re starting to see evidence of that in the scaling of the per-token cost.”
One major driver of the current cost pressure is basic path dependency: much of the industry is still trying to power a new kind of workload with hardware concepts inherited from older problems. Terabit- and petabit-scale networks have largely taken latency off the table for large‑scale AI workloads, so the bottleneck has shifted. Today, the real pain on a hyperscaler bill comes from the memory needed to keep thousands of GPUs fully loaded with model weights and training data. Chipmakers are scrambling to cope, pushing HBM4e memory and specialized inferencing offload chips such as Nvidia’s Rubin architecture.
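To see why memory rather than raw compute dominates the bill, a back-of-envelope estimate helps. The sketch below is illustrative only: the model size, precision, layer counts, and batch assumptions are generic round numbers, not figures from Dhalliwal or any specific vendor.

```python
# Back-of-envelope GPU memory estimate for serving a large language model.
# All sizes, precisions, and batch assumptions are illustrative, not vendor data.

def serving_memory_gb(params_billion: float,
                      bytes_per_param: int = 2,   # fp16/bf16 weights
                      layers: int = 80,
                      kv_heads: int = 8,
                      head_dim: int = 128,
                      context_len: int = 32_768,
                      batch: int = 8) -> float:
    """Approximate memory (GB) needed for model weights plus KV cache."""
    weights = params_billion * 1e9 * bytes_per_param
    # KV cache: K and V tensors per layer, per token, per sequence, at 2 bytes each.
    kv_cache = 2 * layers * kv_heads * head_dim * 2 * context_len * batch
    return (weights + kv_cache) / 1e9

# A 70B-parameter model needs ~140 GB for weights alone -- already more than
# a single 80 GB accelerator holds -- before any per-request KV cache is added.
print(round(serving_memory_gb(70), 1))
```

Under these assumptions the KV cache for a handful of long-context requests adds tens of gigabytes on top of the weights, which is why keeping GPUs "fully loaded" is a memory-capacity problem before it is a FLOPS problem.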
Cloudy with a chance of GPUs: Dhalliwal says a new breed of specialized providers built around Nvidia systems is emerging to absorb heavy inferencing loads that general-purpose infrastructure struggles to support. “We’re seeing new providers basically rising up, where effectively they are providing [inference capacity] at a very low cost, or at an exceptional level. This may well even be a form of outsourcing of AI, even for hyperscalers who are struggling to build out enough data centers or enough capacity.”
Thanks for the memories: Dhalliwal says the scale of hidden data movement behind a single prompt is what catches most IT teams off guard when their bills arrive. “We’re typing just a sentence in natural language, but what’s happening in the background is basically fanning out. There’s a tremendous amount of data shifting around everywhere. The GPUs are pretty quick, but memory pressure on those GPUs is extreme.”
While he is willing to call mainstream databases conceptually clunky, Dhalliwal stresses that the practical question today is how well existing databases and vector engines support specific workloads. Over the next few years, he expects more of the innovation pressure to fall on algorithms and model architectures than on endlessly scaling hardware.
It follows that, for most enterprises, the smartest play is simple: make the infrastructure someone else’s problem. Rather than building bespoke training platforms, Dhalliwal says, organizations are choosing to consume AI natively where their data already lives, whether inside an SAP ERP, a Salesforce CRM, or a content management system. Platform vendors are embedding AI into their products, using their own massive scale to shoulder the model‑tuning burden.
The pragmatic pivot: For Dhalliwal, consolidating onto in-platform AI is a precondition for getting the technology into the hands of everyday business users rather than just a specialist team. “Teams themselves will start to see that they need to re-weaponize their AI stack. When we have AI speaking to other systems using protocols like A2A and MCP, the in-platform AI provides me a much, much faster route to democratizing AI across the enterprise.”
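To make the protocol point concrete: MCP (Model Context Protocol) is built on JSON-RPC 2.0, and a tool invocation from an in-platform assistant to another system looks roughly like the sketch below. The tool name `crm_lookup` and its arguments are hypothetical examples, not part of any real product's API.

```python
import json

# Sketch of an MCP-style tool call. MCP messages are JSON-RPC 2.0;
# the "crm_lookup" tool and its arguments are hypothetical, standing in
# for in-platform AI reaching into a CRM on a user's behalf.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "crm_lookup",
        "arguments": {"account": "Acme Corp", "fields": ["owner", "renewal_date"]},
    },
}

print(json.dumps(request, indent=2))
```

The point of the standardized envelope is exactly the democratization Dhalliwal describes: any compliant client can invoke any compliant server's tools without bespoke integration code for each pairing.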
As organizations make these stack decisions, questions about governance and sovereign control are taking center stage. During the initial AI scramble, some organizations temporarily downplayed traditional security and compliance gates in favor of speed. Now, boards and regulators are paying closer attention to how models are trained, what data they touch, and how their outputs are constrained.
Peacetime protocols: Dhalliwal frames the shift in wartime-to-peacetime terms, noting that the rules set aside during the rush are now returning as board-level priorities. “Governance has been put to the side in this emergency situation around AI, like in war. But eventually, those rules have to come back in again because they protect credibility. How are you protecting my integrity? How are you protecting the firm’s reputation and our products, including our AI products?”
Trust issues: For Dhalliwal, that governance conversation is inseparable from the sovereignty one. “For the skeptical among us, this is the sovereignty discussion. I don’t believe anybody unless it’s in my own sovereign domain. I can see everything. I can verify everything. I can ensure that I’m getting a stable response from AI.”
The window for treating AI as a strategic experiment has closed. What remains is a set of decisions that look far more like traditional enterprise tradeoffs: cost versus control, speed versus governance, flexibility versus long-term accountability. The difference is that the stakes are higher and the feedback loop is faster. Every prompt, every workload, every architectural choice now shows up somewhere on the balance sheet or in a risk review.
The question is which organizations can operationalize those decisions without losing control of cost, data, or credibility. As Dhalliwal puts it, “AI only becomes real when the return holds up under scrutiny. Until then, it’s just potential with a price tag.” Enterprises that treat infrastructure as a deliberate choice rather than a default, and align it tightly with where value is actually created, are the ones that will move past experimentation and into durable advantage.