d-Matrix - Ultra-low Latency Batched Inference for Gen AI

What happens when the real bottleneck in artificial intelligence is no longer training models, but actually running them at scale?

In this episode of Tech Talks Daily, I sit down with Satyam Srivastava from d-Matrix to explore a shift that is quietly reshaping the entire AI infrastructure landscape. While much of the early AI race focused on training ever larger models, the next phase of AI adoption is increasingly defined by inference: the stage at which trained models are deployed and used to generate real-world results millions of times a day.

Satyam brings a unique perspective shaped by years of experience in signal processing, machine learning, and hardware architecture, including time spent at NVIDIA and Intel working on graphics, media technologies, and AI systems. Now at d-Matrix, he is helping design next-generation computing architectures focused on one of the biggest challenges facing the AI industry today: efficiently running large language models without overwhelming data centers with unsustainable power and infrastructure demands.

During our conversation, we explore why the industry underestimated the infrastructure implications of inference at scale. While training large models grabs headlines, the real operational pressure often comes later, when those models must serve millions of queries in real time. That shift places enormous strain on memory bandwidth, energy consumption, and data movement inside modern data centers.

Satyam explains how d-Matrix identified this challenge years before generative AI exploded into the mainstream. Instead of focusing on training hardware like many AI startups at the time, the company concentrated on inference efficiency. That decision is becoming increasingly relevant as organizations begin to realize that simply adding more GPUs to data centers is not a sustainable long-term strategy.

We also discuss the growing power constraints surrounding AI infrastructure, and why efficiency-driven design may be the only realistic path forward. With electricity supply, cooling capacity, and semiconductor availability all becoming limiting factors, the industry is being forced to rethink how AI systems are architected. Custom silicon, purpose-built accelerators, and heterogeneous computing environments are now emerging as key pieces of the puzzle.

The conversation also touches on the geopolitical and economic importance of AI semiconductor leadership, and why the relationship between frontier AI labs, infrastructure providers, and chip designers is becoming increasingly strategic. As governments and companies compete to maintain technological leadership, the question of who controls the hardware powering AI may prove just as important as the models themselves.

Looking ahead, Satyam shares his perspective on how the role of engineers will evolve as AI infrastructure becomes more specialized and energy-aware. Foundational engineering skills remain essential, but the next generation of engineers will also need to think in terms of entire systems, combining software, hardware, and AI tools to build more efficient computing environments.

As AI continues to move from research labs into everyday products and services, are organizations prepared for the infrastructure shift that comes with an inference-driven future? And could efficiency, rather than raw computing power, become the defining metric of the next phase of the AI race?
