d-Matrix - Ultra-low Latency Batched Inference for Gen AI

What happens when the real bottleneck in artificial intelligence is no longer training models, but actually running them at scale?

In this episode of Tech Talks Daily, I sit down with Satyam Srivastava from d-Matrix to explore a shift that is quietly reshaping the entire AI infrastructure landscape. While much of the early AI race focused on training ever larger models, the next phase of AI adoption is increasingly defined by inference: the stage at which trained models are deployed and used to generate real-world results millions of times a day.

Satyam brings a unique perspective shaped by years of experience in signal processing, machine learning, and hardware architecture, including time spent at NVIDIA and Intel working on graphics, media technologies, and AI systems. Now at d-Matrix, he is helping design next-generation computing architectures focused on one of the biggest challenges facing the AI industry today: efficiently running large language models without overwhelming data centers with unsustainable power and infrastructure demands.

During our conversation, we explored why the industry underestimated the infrastructure implications of inference at scale. While training large models grabs headlines, the real operational pressure often comes later when those models must serve millions of queries in real time. That shift places enormous strain on memory bandwidth, energy consumption, and data movement inside modern data centers.

Satyam explains how d-Matrix identified this challenge years before generative AI exploded into the mainstream. Instead of focusing on training hardware like many AI startups at the time, the company concentrated on inference efficiency. That decision is becoming increasingly relevant as organizations begin to realize that simply adding more GPUs to data centers is not a sustainable long-term strategy.

We also discuss the growing power constraints surrounding AI infrastructure, and why efficiency-driven design may be the only realistic path forward. With electricity supply, cooling capacity, and semiconductor availability all becoming limiting factors, the industry is being forced to rethink how AI systems are architected. Custom silicon, purpose-built accelerators, and heterogeneous computing environments are now emerging as key pieces of the puzzle.

The conversation also touches on the geopolitical and economic importance of AI semiconductor leadership, and why the relationship between frontier AI labs, infrastructure providers, and chip designers is becoming increasingly strategic. As governments and companies compete to maintain technological leadership, the question of who controls the hardware powering AI may prove just as important as the models themselves.

Looking ahead, Satyam shares his perspective on how the role of engineers will evolve as AI infrastructure becomes more specialized and energy-aware. Foundational engineering skills remain essential, but the next generation of engineers will also need to think in terms of entire systems, combining software, hardware, and AI tools to build more efficient computing environments.

As AI continues to move from research labs into everyday products and services, are organizations prepared for the infrastructure shift that comes with an inference-driven future? And could efficiency, rather than raw computing power, become the defining metric of the next phase of the AI race?

Episodes (2000)

3474: Mendix CTO on Closing the Gap Between University and Industry

There was a time when a computer science degree almost guaranteed a fast track into a well-paid career. But that promise is slipping. In today's Tech Talks Daily episode, I reconnect with Hans de Viss...

3 Nov 2025, 26 min

3473: CybExer Technologies on Building the World's First Space Cyber Range

What does cybersecurity look like beyond Earth's atmosphere? That's the question at the heart of this conversation with Kristiina Omri, Vice President of Special Programs at CybExer Technologies, and ...

2 Nov 2025, 35 min

3472: How Estonia is Scaling Space Through Software and Partnerships

In this episode, I sit down in Tallinn with Madis Võõras, Head of the Estonian Space Office at Enterprise Estonia, to unpack how Estonia is carving out a real role in the European space sector through...

1 Nov 2025, 31 min

3471: How Estonia Is Defining the Future of Space and Cyber Defense

What role does cybersecurity play when the battlefield extends beyond Earth's atmosphere? In this special episode recorded live in Tallinn for the fifth anniversary of the Software Defined Space Confe...

31 Oct 2025, 27 min

3470: How Netomi is Bringing Humanity Back to AI-Driven Customer Experience

Artificial intelligence has changed how we think about service, but few companies have bridged the gap between automation and genuine intelligence. In this episode of Tech Talks Daily, I'm joined by P...

30 Oct 2025, 27 min

3469: Inside Boston Consulting Group (BCG)'s Global Research on AI at Work

What if the biggest barrier to AI adoption isn't the technology itself, but our ability to learn, adapt, and reskill? That question sits at the heart of my conversation with Sagar Goel, Managing Direc...

29 Oct 2025, 23 min

3468: From Upwork to Acquisition: How Eden Data Turned Cybersecurity into a Growth Engine

What if your cybersecurity strategy could become your biggest sales advantage? In this episode, I sit down with Taylor Hersom, Founder and CEO of Eden Data, to explore how startups can transform compl...

28 Oct 2025, 28 min

3467: How Springboard IQ is Helping Startup Founders Rebuild Go-To-Market Strategies

What happens when early-stage founders realise their go-to-market strategy just isn't working? Do they double down on outdated advice or take a fresh look at how modern buyers actually engage? In thi...

27 Oct 2025, 28 min
