Confronting AI’s Next Big Challenge: Inference Compute

While AI training garners most of the spotlight (and the investment), the demands of AI inference are shaping up to be an even bigger challenge. In this episode of The New Stack Makers, Sid Sheth, founder and CEO of d-Matrix, argues that inference is anything but one-size-fits-all. Different use cases, from low-cost to high-interactivity or throughput-optimized, require tailored hardware, and existing GPU architectures aren’t built to address all these needs simultaneously.

“The world of inference is going to be truly heterogeneous,” Sheth said, meaning specialized hardware will be required to meet diverse performance profiles. A major bottleneck? The distance between memory and compute. Inference, especially in generative AI and agentic workflows, requires constant memory access, so minimizing the distance data must travel is key to improving performance and reducing cost.

To address this, d-Matrix developed Corsair, a modular platform where memory and compute are vertically stacked — “like pancakes” — enabling faster, more efficient inference. The result is scalable, flexible AI infrastructure purpose-built for inference at scale.

Learn more from The New Stack about inference compute and AI

Scaling AI Inference at the Edge with Distributed PostgreSQL

Deep Infra Is Building an AI Inference Cloud for Developers

This episode is sourced from an open RSS feed and is not published by Podme. It may therefore contain ads.

Episodes (300)

Why MotherDuck refuses to fork DuckDB

At a recent MCP developer summit, The New Stack spoke with Till Döhmen, AI lead at MotherDuck, about the company’s growing role in the evolving DuckDB ecosystem. Backed by investors including Tomasz Tun...

27 May, 27 min

JetBrains is selling independence as the rest of AI coding picks sides

JetBrains is positioning itself as the last major independent AI coding-tool vendor in a market increasingly tied to hyperscalers and foundation model labs. Speaking at Google Cloud Next, JetBrains VP...

21 May, 26 min

Why Block handed Goose to the Linux Foundation

What began as an internal developer tool at Block has evolved into a broader open-source initiative with industry backing. Goose, Block’s AI coding agent, followed a path similar to Amazon’s transformat...

15 May, 19 min

Fivetran's CPO: closed data stacks won't survive the agent era

At Google Cloud Next 2026, Fivetran Chief Product Officer Anjan Kundavaram argued that enterprise data systems are unprepared for the scale of AI-driven analytics. Unlike humans, AI agents can generat...

13 May, 22 min

The new FinOps problem isn't cloud bills

At Google Cloud Next 2026, Finout co-founder and CEO Roi Ravhon and Google Cloud FinOps lead Pathik Sharma discussed how FinOps is rapidly evolving for the AI era. Ravhon argued that while cloud FinOp...

12 May, 28 min

How Microsoft is governing thousands of Kubernetes clusters without manual intervention

Managing Kubernetes at fleet scale introduces significant complexity, especially as organizations expand from a few clusters to hundreds or thousands across cloud, on-premises, and edge environments. ...

7 May, 25 min

Why long-running AI agents break on HTTP and how Ably is fixing it

In this episode of The New Stack Makers, Matthew O’Riordan, CEO of Ably, explains how infrastructure originally built for human collaboration is now well-suited for long-running AI agents. While Ably i...

6 May, 31 min

Why the Linux Foundation adopted MCP, with Jim Zemlin and Mazin Gilbert

Agentic AI is advancing rapidly, with open-source projects racing to keep pace with real-world deployment. To accelerate progress, the Linux Foundation consolidated key technologies—Model Context Prot...

6 May, 32 min
