Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (778)

Visual Generative AI Ecosystem Challenges with Richard Zhang - #656

Today we’re joined by Richard Zhang, senior research scientist at Adobe Research. In our conversation with Richard, we explore the research challenges that arise when regarding visual generative AI fr...

20 Nov 2023, 40 min

Deploying Edge and Embedded AI Systems with Heather Gorr - #655

Today we’re joined by Heather Gorr, principal MATLAB product marketing manager at MathWorks. In our conversation with Heather, we discuss the deployment of AI models to hardware devices and embedded A...

13 Nov 2023, 38 min

AI Sentience, Agency and Catastrophic Risk with Yoshua Bengio - #654

Today we’re joined by Yoshua Bengio, professor at Université de Montréal. In our conversation with Yoshua, we discuss AI safety and the potentially catastrophic risks of its misuse. Yoshua highlights ...

6 Nov 2023, 48 min

Delivering AI Systems in Highly Regulated Environments with Miriam Friedel - #653

Today we’re joined by Miriam Friedel, senior director of ML engineering at Capital One. In our conversation with Miriam, we discuss some of the challenges faced when delivering machine learning tools ...

30 Oct 2023, 44 min

Mental Models for Advanced ChatGPT Prompting with Riley Goodside - #652

Today we’re joined by Riley Goodside, staff prompt engineer at Scale AI. In our conversation with Riley, we explore LLM capabilities and limitations, prompt engineering, and the mental models required...

23 Oct 2023, 39 min

Multilingual LLMs and the Values Divide in AI with Sara Hooker - #651

Today we’re joined by Sara Hooker, director at Cohere and head of Cohere For AI, Cohere’s research lab. In our conversation with Sara, we explore some of the challenges with multilingual models like p...

16 Oct 2023, 1 h 18 min

Scaling Multi-Modal Generative AI with Luke Zettlemoyer - #650

Today we’re joined by Luke Zettlemoyer, professor at University of Washington and a research manager at Meta. In our conversation with Luke, we cover multimodal generative AI, the effect of data on mo...

9 Oct 2023, 38 min

Pushing Back on AI Hype with Alex Hanna - #649

Today we’re joined by Alex Hanna, the Director of Research at the Distributed AI Research Institute (DAIR). In our conversation with Alex, we discuss the topic of AI hype and the importance of tacklin...

2 Oct 2023, 49 min
