Scaling Agentic Inference Across Heterogeneous Compute with Zain Asgar - #757

In this episode, Zain Asgar, co-founder and CEO of Gimlet Labs, joins us to discuss heterogeneous AI inference across diverse hardware. Zain argues that the current industry standard of running all AI workloads on high-end GPUs is unsustainable for agents, which consume significantly more tokens than traditional LLM applications. We explore Gimlet’s approach to heterogeneous inference, which involves disaggregating workloads across a mix of hardware, from H100s to older GPUs and CPUs, to optimize unit economics without sacrificing performance. We dive into their "three-layer cake" architecture: workload disaggregation, a compilation layer that maps models to specific hardware targets, and a novel system that uses LLMs to autonomously rewrite and optimize compute kernels. Finally, we discuss the complexities of networking in heterogeneous environments, the trade-offs between numerical precision and application accuracy, and the future of hardware-aware scheduling. The complete show notes for this episode can be found at https://twimlai.com/go/757.

Episodes (780)

AI Trends 2026: OpenClaw Agents, Reasoning LLMs, and More with Sebastian Raschka - #762

In this episode, Sebastian Raschka, independent LLM researcher and author, joins us to break down how the LLM landscape has changed over the past year and what is likely to matter most in 2026. We dis...

26 Feb 1h 18min

The Evolution of Reasoning in Small Language Models with Yejin Choi - #761

Today, we're joined by Yejin Choi, professor and senior fellow at Stanford University in the Computer Science Department and the Institute for Human-Centered AI (HAI). In this conversation, we explore...

29 Jan 1h 6min

Intelligent Robots in 2026: Are We There Yet? with Nikita Rudin - #760

Today, we're joined by Nikita Rudin, co-founder and CEO of Flexion Robotics to discuss the gap between current robotic capabilities and what’s required to deploy fully autonomous robots in the real wo...

8 Jan 1h 6min

Rethinking Pre-Training for Agentic AI with Aakanksha Chowdhery - #759

Today, we're joined by Aakanksha Chowdhery, member of technical staff at Reflection, to explore the fundamental shifts required to build true agentic AI. While the industry has largely focused on post...

17 Dec 2025 52min

Why Vision Language Models Ignore What They See with Munawar Hayat - #758

In this episode, we’re joined by Munawar Hayat, researcher at Qualcomm AI Research, to discuss a series of papers presented at NeurIPS 2025 focusing on multimodal and generative AI. We dive into the p...

9 Dec 2025 57min

Proactive Agents for the Web with Devi Parikh - #756

Today, we're joined by Devi Parikh, co-founder and co-CEO of Yutori, to discuss browser use models and a future where we interact with the web through proactive, autonomous agents. We explore the tech...

19 Nov 2025 56min

AI Orchestration for Smart Cities and the Enterprise with Robin Braun and Luke Norris - #755

Today, we're joined by Robin Braun, VP of AI business development for hybrid cloud at HPE, and Luke Norris, co-founder and CEO of Kamiwaza, to discuss how AI systems can be used to automate complex wo...

12 Nov 2025 54min
