Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723

Today, we're joined by Jonas Geiping, research group leader at the ELLIS Institute and the Max Planck Institute for Intelligent Systems, to discuss his recent paper, “Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach.” The paper proposes a novel language model architecture that uses recurrent depth to enable “thinking in latent space.” We dig into “internal reasoning” versus “verbalized reasoning” (analogous to non-verbalized and verbalized thinking in humans) and discuss how the model searches in latent space to predict the next token and dynamically allocates more compute based on token difficulty. We also explore how the recurrent depth architecture simplifies LLMs, the parallels to diffusion models, the model's performance on reasoning tasks, the challenges of comparing models with varying compute budgets, and architectural advantages such as zero-shot adaptive exits and natural speculative decoding. The complete show notes for this episode can be found at https://twimlai.com/go/723.

Episodes (781)

AI for Power & Energy with Laurent Boinot - #683

Today we're joined by Laurent Boinot, power and utilities lead for the Americas at Microsoft, to discuss the intersection of AI and energy infrastructure. We discuss the many challenges faced by curre...

7 May 2024, 49 min

Controlling Fusion Reactor Instability with Deep Reinforcement Learning with Aza Jalalvand - #682

Today we're joined by Azarakhsh (Aza) Jalalvand, a research scholar at Princeton University, to discuss his work using deep reinforcement learning to control plasma instabilities in nuclear fusion rea...

29 Apr 2024, 42 min

GraphRAG: Knowledge Graphs for AI Applications with Kirk Marple - #681

Today we're joined by Kirk Marple, CEO and founder of Graphlit, to explore the emerging paradigm of "GraphRAG," or Graph Retrieval Augmented Generation. In our conversation, Kirk digs into the GraphRA...

22 Apr 2024, 47 min

Teaching Large Language Models to Reason with Reinforcement Learning with Alex Havrilla - #680

Today we're joined by Alex Havrilla, a PhD student at Georgia Tech, to discuss "Teaching Large Language Models to Reason with Reinforcement Learning." Alex discusses the role of creativity and explora...

16 Apr 2024, 46 min

Localizing and Editing Knowledge in LLMs with Peter Hase - #679

Today we're joined by Peter Hase, a fifth-year PhD student at the University of North Carolina NLP lab. We discuss "scalable oversight", and the importance of developing a deeper understanding of how ...

8 Apr 2024, 49 min

Coercing LLMs to Do and Reveal (Almost) Anything with Jonas Geiping - #678

Today we're joined by Jonas Geiping, a research group leader at the ELLIS Institute, to explore his paper: "Coercing LLMs to Do and Reveal (Almost) Anything". Jonas explains how neural networks can be...

1 Apr 2024, 48 min

V-JEPA, AI Reasoning from a Non-Generative Architecture with Mido Assran - #677

Today we’re joined by Mido Assran, a research scientist at Meta’s Fundamental AI Research (FAIR). In this conversation, we discuss V-JEPA, a new model being billed as “the next step in Yann LeCun's vi...

25 Mar 2024, 47 min

Video as a Universal Interface for AI Reasoning with Sherry Yang - #676

Today we’re joined by Sherry Yang, senior research scientist at Google DeepMind and a PhD student at UC Berkeley. In this interview, we discuss her new paper, "Video as the New Language for Real-World...

18 Mar 2024, 49 min
