The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
10 March 2025

Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li - #722

Today, we're joined by Chengzu Li, PhD student at the University of Cambridge, to discuss his recent paper, “Imagine while Reasoning in Space: Multimodal Visualization-of-Thought.” We explore the motivations behind MVoT, its connection to prior work like TopViewRS, and its relation to cognitive science principles such as dual coding theory. We dig into the MVoT framework along with its task environments: maze, mini-behavior, and frozen lake. We also discuss token discrepancy loss, a technique designed to align language and visual embeddings so that the generated visual representations are accurate and meaningful. Additionally, we cover the data collection and training process, reasoning over relative spatial relations between different entities, and dynamic spatial reasoning. Lastly, Chengzu shares insights from experiments with MVoT, focusing on the lessons learned and the potential for applying these models in real-world scenarios like robotics and architectural design. The complete show notes for this episode can be found at https://twimlai.com/go/722.


Episodes (782)

RAG Risks: Why Retrieval-Augmented LLMs are Not Safer with Sebastian Gehrmann - #732

Today, we're joined by Sebastian Gehrmann, head of responsible AI in the Office of the CTO at Bloomberg, to discuss AI safety in retrieval-augmented generation (RAG) systems and generative AI in high-...

21 May 2025, 57 min

From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy - #731

Today, we're joined by Mahesh Sathiamoorthy, co-founder and CEO of Bespoke Labs, to discuss how reinforcement learning (RL) is reshaping the way we build custom agents on top of foundation models. Mah...

13 May 2025, 1 h 1 min

How OpenAI Builds AI Agents That Think and Act with Josh Tobin - #730

Today, we're joined by Josh Tobin, member of technical staff at OpenAI, to discuss the company’s approach to building AI agents. We cover OpenAI's three agentic offerings—Deep Research for comprehensi...

6 May 2025, 1 h 7 min

CTIBench: Evaluating LLMs in Cyber Threat Intelligence with Nidhi Rastogi - #729

Today, we're joined by Nidhi Rastogi, assistant professor at Rochester Institute of Technology, to discuss Cyber Threat Intelligence (CTI), focusing on her recent project CTIBench, a benchmark for evalu...

30 Apr 2025, 56 min

Generative Benchmarking with Kelly Hong - #728

In this episode, Kelly Hong, a researcher at Chroma, joins us to discuss "Generative Benchmarking," a novel approach to evaluating retrieval systems, like RAG applications, using synthetic data. Kelly...

23 Apr 2025, 54 min

Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727

In this episode, Emmanuel Ameisen, a research engineer at Anthropic, returns to discuss two recent papers: "Circuit Tracing: Revealing Language Model Computational Graphs" and "On the Biology of a Lar...

14 Apr 2025, 1 h 34 min

Teaching LLMs to Self-Reflect with Reinforcement Learning with Maohao Shen - #726

Today, we're joined by Maohao Shen, PhD student at MIT, to discuss his paper, “Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search.” We dig into...

8 Apr 2025, 51 min

Waymo's Foundation Model for Autonomous Driving with Drago Anguelov - #725

Today, we're joined by Drago Anguelov, head of AI foundations at Waymo, for a deep dive into the role of foundation models in autonomous driving. Drago shares how Waymo is leveraging large-scale machi...

31 March 2025, 1 h 9 min

