Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li - #722

Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li - #722

Today, we're joined by Chengzu Li, PhD student at the University of Cambridge to discuss his recent paper, “Imagine while Reasoning in Space: Multimodal Visualization-of-Thought.” We explore the motivations behind MVoT, its connection to prior work like TopViewRS, and its relation to cognitive science principles such as dual coding theory. We dig into the MVoT framework along with its various task environments—maze, mini-behavior, and frozen lake. We explore token discrepancy loss, a technique designed to align language and visual embeddings, ensuring accurate and meaningful visual representations. Additionally, we cover the data collection and training process, reasoning over relative spatial relations between different entities, and dynamic spatial reasoning. Lastly, Chengzu shares insights from experiments with MVoT, focusing on the lessons learned and the potential for applying these models in real-world scenarios like robotics and architectural design. The complete show notes for this episode can be found at https://twimlai.com/go/722.

Avsnitt(784)

Brendan Frey - Reprogramming the Human Genome with AI - TWiML Talk #12

Brendan Frey - Reprogramming the Human Genome with AI - TWiML Talk #12

My guest this week is Brendan Frey, Professor of Engineering and Medicine at the University of Toronto and Co-Founder and CEO of the startup Deep Genomics. Brendan and I met at the Re-Work Deep Learni...

24 Feb 20171h

Hilary Mason - Building AI Products - TWiML Talk #11

Hilary Mason - Building AI Products - TWiML Talk #11

My guest this time is Hilary Mason. Hilary was one of the first “famous” data scientists. I remember hearing her speak back in 2011 at the Strange Loop conference in St. Louis. At the time she was Chi...

25 Jan 201717min

Francisco Webber - Statistics vs Semantics for Natural Language Processing - TWiML Talk #10

Francisco Webber - Statistics vs Semantics for Natural Language Processing - TWiML Talk #10

My guest this time is Francisco Webber, founder and General Manager of artificial intelligence startup Cortical.io. Francisco presented at the O’Reilly AI conference on an approach to natural language...

3 Dec 201649min

Pascale Fung - Emotional AI: Teaching Computers Empathy - TWiML Talk #9

Pascale Fung - Emotional AI: Teaching Computers Empathy - TWiML Talk #9

My guest this time is Pascale Fung, professor of electrical & computer engineering at Hong Kong University of Science and Technology. Pascale delivered a presentation at the recent O'Reilly AI confere...

8 Nov 201634min

Diogo Almeida - Deep Learning: Modular in Theory, Inflexible in Practice - TWiML Talk #8

Diogo Almeida - Deep Learning: Modular in Theory, Inflexible in Practice - TWiML Talk #8

My guest this time is Diogo Almeida, senior data scientist at healthcare startup Enlitic. Diogo and I met at the O'Reilly AI conference, where he delivered a great presentation on in-the-trenches deep...

23 Okt 201646min

Carlos Guestrin - Explaining the Predictions of Machine Learning Models - TWiML Talk #7

Carlos Guestrin - Explaining the Predictions of Machine Learning Models - TWiML Talk #7

My guest this time is Carlos Guestrin, the Amazon professor of Machine Learning at the University of Washington. Carlos and I recorded this podcast at a conference, shortly after Apple's acquisition o...

9 Okt 201631min

Angie Hugeback - Generating Training Data for Your ML Models - TWiML Talk #6

Angie Hugeback - Generating Training Data for Your ML Models - TWiML Talk #6

My guest this time is Angie Hugeback, who is principal data scientist at Spare5. Spare5 helps customers generate the high-quality labeled training datasets that are so crucial to developing accurate m...

29 Sep 20161h 1min

Joshua Bloom - Machine Learning for the Stars & Productizing AI - TWiML Talk #5

Joshua Bloom - Machine Learning for the Stars & Productizing AI - TWiML Talk #5

My guest this time is Joshua Bloom. Josh is professor of astronomy at the University of California, Berkeley and co-founder and Chief Technology Officer of machine learning startup Wise.io. In this wi...

22 Sep 20161h 28min

Populärt inom Politik & nyheter

aftonbladet-krim
svenska-fall
rss-krimstad
p3-krim
spar
aftonbladet-daily
flashback-forever
rss-sanning-konsekvens
politiken
rss-krimreportrarna
motiv
blenda-2
rss-flodet
rss-frandfors-horna
grans
rss-vad-fan-hande
rss-aftonbladet-krim
dagens-eko
svd-ledarredaktionen
olyckan-inifran