d-Matrix - Ultra-low Latency Batched Inference for Gen AI

What happens when the real bottleneck in artificial intelligence is no longer training models, but actually running them at scale?

In this episode of Tech Talks Daily, I sit down with Satyam Srivastava from d-Matrix to explore a shift that is quietly reshaping the entire AI infrastructure landscape. While much of the early AI race focused on training ever larger models, the next phase of AI adoption is increasingly defined by inference: the stage at which trained models are deployed and used to generate real-world results millions of times a day.

Satyam brings a unique perspective shaped by years of experience in signal processing, machine learning, and hardware architecture, including time spent at NVIDIA and Intel working on graphics, media technologies, and AI systems. Now at d-Matrix, he is helping design next-generation computing architectures focused on one of the biggest challenges facing the AI industry today: efficiently running large language models without overwhelming data centers with unsustainable power and infrastructure demands.

During our conversation, we explore why the industry underestimated the infrastructure implications of inference at scale. While training large models grabs headlines, the real operational pressure often comes later, when those models must serve millions of queries in real time. That shift places enormous strain on memory bandwidth, energy consumption, and data movement inside modern data centers.

Satyam explains how d-Matrix identified this challenge years before generative AI exploded into the mainstream. Instead of focusing on training hardware like many AI startups at the time, the company concentrated on inference efficiency. That decision is becoming increasingly relevant as organizations begin to realize that simply adding more GPUs to data centers is not a sustainable long-term strategy.

We also discuss the growing power constraints surrounding AI infrastructure, and why efficiency-driven design may be the only realistic path forward. With electricity supply, cooling capacity, and semiconductor availability all becoming limiting factors, the industry is being forced to rethink how AI systems are architected. Custom silicon, purpose-built accelerators, and heterogeneous computing environments are now emerging as key pieces of the puzzle.

The conversation also touches on the geopolitical and economic importance of AI semiconductor leadership, and why the relationship between frontier AI labs, infrastructure providers, and chip designers is becoming increasingly strategic. As governments and companies compete to maintain technological leadership, the question of who controls the hardware powering AI may prove just as important as the models themselves.

Looking ahead, Satyam shares his perspective on how the role of engineers will evolve as AI infrastructure becomes more specialized and energy-aware. Foundational engineering skills remain essential, but the next generation of engineers will also need to think in terms of entire systems, combining software, hardware, and AI tools to build more efficient computing environments.

As AI continues to move from research labs into everyday products and services, are organizations prepared for the infrastructure shift that comes with an inference-driven future? And could efficiency, rather than raw computing power, become the defining metric of the next phase of the AI race?

Episodes (2000)

Invisible Technologies CEO On Building AI Around Real Workflows, Not Hype

What does it actually take to make AI work inside a real business, where messy data, human judgment, and operational risk all collide? In this episode, I sit down with Matt Fitzpatrick, CEO of Invisib...

13 Apr 29min

Willow On How AI Is Changing The Way Buildings Operate

In this episode, I speak with Bert Van Hoof, CEO of Willow, about how AI is starting to reshape the built world in ways that go far beyond smart dashboards and efficiency reports. Bert brings decades ...

12 Apr 48min

Blumberg Capital On What Investors Really Want From AI Founders Now

What does it really take to build the next generation of AI companies when the hype around scale begins to fade and real-world impact takes center stage? In this episode, I sit down with David Blumber...

11 Apr 47min

AI Psychosis Explained With Dr. Ragy Girgis From Columbia University

How do we talk about artificial intelligence without ignoring the very human consequences it can have on our mental health? In this episode, I sit down with Dr. Ragy Girgis, Professor of Clinical Psyc...

10 Apr 24min

Flexera: Why 2026 Is AI's 'Back to Basics' Moment

Why are so many AI projects failing to deliver real business value, despite the hype and investment? In this episode, I sit down with Jay Litkey, SVP of Cloud & FinOps at Flexera, to explore the growi...

9 Apr 18min

The Lucid Software Playbook For Aligning People, Process, And AI

How do you bring people together to do better work when everything around them feels increasingly complex, distributed, and uncertain? In today's episode, I sat down with Jessica Guistolise from Lucid...

8 Apr 31min

EvoluteIQ On Rethinking ROI In The Age Of Enterprise AI

What happens when the very pricing model meant to speed up AI adoption ends up slowing it down? In this episode of Tech Talks Daily, I sit down with Sameet Gupte, CEO and co-founder of EvoluteIQ, to d...

7 Apr 40min

Closing The AI Trust Gap In Customer Experience With Cyara

How many bad customer experiences does it take before someone walks away for good? In my conversation with Amitha Pulijala, we explore why the answer might be fewer than most businesses are prepared f...

6 Apr 33min
