Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723

Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723

Today, we're joined by Jonas Geiping, research group leader at Ellis Institute and the Max Planck Institute for Intelligent Systems to discuss his recent paper, “Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach.” This paper proposes a novel language model architecture which uses recurrent depth to enable “thinking in latent space.” We dig into “internal reasoning” versus “verbalized reasoning”—analogous to non-verbalized and verbalized thinking in humans, and discuss how the model searches in latent space to predict the next token and dynamically allocates more compute based on token difficulty. We also explore how the recurrent depth architecture simplifies LLMs, the parallels to diffusion models, the model's performance on reasoning tasks, the challenges of comparing models with varying compute budgets, and architectural advantages such as zero-shot adaptive exits and natural speculative decoding. The complete show notes for this episode can be found at https://twimlai.com/go/723.

Denne episoden er hentet fra en åpen RSS-feed og er ikke publisert av Podme. Den kan derfor inneholde annonser.

Episoder(788)

Why AI Agents Break the GenAI Security Model with Devvret Rishi - #770

Why AI Agents Break the GenAI Security Model with Devvret Rishi - #770

In this episode, Sam talks with Dev Rishi, GM of AI at Rubrik, about what happens when agents move beyond answering questions and start taking action across tools, systems, and business processes. We...

16 Jun 56min

Is RAG Dead? Lessons from Building AI for Tax Law with Alex Bowcut - #769

Is RAG Dead? Lessons from Building AI for Tax Law with Alex Bowcut - #769

As context windows grow into the millions of tokens, many AI practitioners are questioning whether retrieval-augmented generation (RAG) is still necessary. If modern models can ingest entire libraries...

9 Jun 51min

Relational Foundation Models for Enterprise Data with Jure Leskovec - #768

Relational Foundation Models for Enterprise Data with Jure Leskovec - #768

In this episode, Jure Leskovec, co-founder and chief scientist at Kumo and professor of computer science at Stanford, joins us to explore two fronts of his work: AI for science and relational deep lea...

21 Mai 1h 6min

How to Find the Agent Failures Your Evals Miss with Scott Clark - #767

How to Find the Agent Failures Your Evals Miss with Scott Clark - #767

In this episode, Scott Clark, co-founder and CEO of Distributional, joins us to explore how teams can reliably operate and improve complex LLM systems and agents in production. Scott introduces a Masl...

7 Mai 53min

How to Engineer AI Inference Systems with Philip Kiely - #766

How to Engineer AI Inference Systems with Philip Kiely - #766

In this episode, Philip Kiely, head of AI education at Baseten, joins us to unpack the fast-evolving discipline of inference engineering. We explore why inference has become the stickiest and most cri...

30 Apr 54min

How Capital One Delivers Multi-Agent Systems with Rashmi Shetty - #765

How Capital One Delivers Multi-Agent Systems with Rashmi Shetty - #765

In this episode, Rashmi Shetty, senior director of enterprise generative AI platform at Capital One, joins us to explore how the company is designing, deploying, and scaling multi-agent systems in a h...

16 Apr 54min

The Race to Production-Grade Diffusion LLMs with Stefano Ermon - #764

The Race to Production-Grade Diffusion LLMs with Stefano Ermon - #764

Today, we're joined by Stefano Ermon, associate professor at Stanford University and CEO of Inception Labs to discuss diffusion language models. We dig into how diffusion approaches—traditionally used...

26 Mar 1h 3min

Agent Swarms and Knowledge Graphs for Autonomous Software Development with Siddhant Pardeshi - #763

Agent Swarms and Knowledge Graphs for Autonomous Software Development with Siddhant Pardeshi - #763

In this episode, Sid Pardeshi, co-founder and CTO of Blitzy, joins us to discuss building autonomous development systems able to deliver production-ready software at enterprise scale. Sid contrasts AI...

10 Mar 1h 16min

Populært innen Politikk og nyheter

giver-og-gjengen-vg
aftenpodden
aftenpodden-usa
fotballpodden-2
forklart
stopp-verden
popradet
nokon-ma-ga
lydartikler-fra-aftenposten
rss-espen-lee-usensurert
det-store-bildet
rss-gukild-johaug
dine-penger-pengeradet
rss-ness
aftenbla-bla
hanna-de-heldige
rss-utenrikskomiteen-med-bogen-og-grasvik
frokostshowet-pa-p5
e24-podden
rss-penger-polser-og-politikk