Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (781)

Scaling AI for the Enterprise with Mazin Gilbert - TWiML Talk #78

This week on the podcast we’re running a series of shows consisting of conversations with some of the impressive speakers from an event called the AI Summit in New York City. The theme of the conferen...

5 Dec 2017, 49 min

Scalable Distributed Deep Learning with Hillery Hunter - TWiML Talk #77

This week on the podcast we’re running a series of shows consisting of conversations with some of the impressive speakers from an event called the AI Summit in New York City. The theme of the conferen...

4 Dec 2017, 38 min

Robotics at OpenAI with Jonas Schneider - TWiML Talk #76

The show is part of a series that I’m really excited about, in part because I’ve been working to bring them to you for quite a while now. The focus of the series is a sampling of the interesting work ...

1 Dec 2017, 45 min

AI Robustness and Safety with Dario Amodei - TWiML Talk #75

The show is part of a series that I’m really excited about, in part because I’ve been working to bring them to you for quite a while now. The focus of the series is a sampling of the interesting work ...

30 Nov 2017, 36 min

Towards Artificial General Intelligence with Greg Brockman - TWiML Talk #74

The show is part of a series that I’m really excited about, in part because I’ve been working to bring them to you for quite a while now. The focus of the series is a sampling of the interesting work ...

28 Nov 2017, 55 min

Explaining Black Box Predictions with Sam Ritchie - TWiML Talk #73

This week, we’ll be featuring a series of shows recorded from Strange Loop, a great developer-focused conference that takes place every year right in my backyard! The conference is a multi-disciplinar...

25 Nov 2017, 38 min

Experimental Creative Writing with the Vectorized Word - Allison Parrish - TWiML Talk #72

This week, we’ll be featuring a series of shows recorded from Strange Loop, a great developer-focused conference that takes place every year right in my backyard! The conference is a multi-disciplinar...

24 Nov 2017, 28 min

The Biological Path Towards Strong AI - Matthew Taylor - TWiML Talk #71

This week, we’ll be featuring a series of shows recorded from Strange Loop, a great developer-focused conference that takes place every year right in my backyard! The conference is a multi-disciplinar...

22 Nov 2017, 37 min
