Video as a Universal Interface for AI Reasoning with Sherry Yang - #676

Today we’re joined by Sherry Yang, senior research scientist at Google DeepMind and a PhD student at UC Berkeley. In this interview, we discuss her new paper, “Video as the New Language for Real-World Decision Making,” which explores how generative video models can play a role similar to language models as a way to solve tasks in the real world. Sherry draws the analogy between natural language as a unified representation of information and text prediction as a common task interface and demonstrates how video as a medium and generative video as a task exhibit similar properties. This formulation enables video generation models to play a variety of real-world roles as planners, agents, compute engines, and environment simulators. Finally, we explore UniSim, an interactive demo of Sherry's work and a preview of her vision for interacting with AI-generated environments. The complete show notes for this episode can be found at twimlai.com/go/676.