Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (778)

Angie Hugeback - Generating Training Data for Your ML Models - TWiML Talk #6

My guest this time is Angie Hugeback, who is principal data scientist at Spare5. Spare5 helps customers generate the high-quality labeled training datasets that are so crucial to developing accurate m...

29 Sep 2016, 1h 1min

Joshua Bloom - Machine Learning for the Stars & Productizing AI - TWiML Talk #5

My guest this time is Joshua Bloom. Josh is professor of astronomy at the University of California, Berkeley and co-founder and Chief Technology Officer of machine learning startup Wise.io. In this wi...

22 Sep 2016, 1h 28min

Charles Isbell - Interactive AI, Plus Improving ML Education - TWiML Talk #4

My guest this time is Charles Isbell, Jr., Professor and Senior Associate Dean in the College of Computing at Georgia Institute of Technology. Charles and I go back a bit… in fact he’s the first AI re...

10 Sep 2016, 1h 4min

Xavier Amatriain - Engineering Practical Machine Learning Systems - TWiML Talk #3

My guest this time is Xavier Amatriain. Xavier is a former researcher who went on to lead the machine learning recommendations team at Netflix, and is now the vice president of engineering at Quora, t...

28 Aug 2016, 56min

Siraj Raval - How to Build Confidence as an ML Developer - TWiML Talk #2

Siraj Raval is a machine learning hacker and teacher whose "Machine Learning for Hackers" and "Fresh Machine Learning" YouTube series are fun, informative, high-energy and practical ways to learn about a ...

21 Aug 2016, 40min

This Week in ML & AI – 8/12/16: Another huge machine learning acquisition + AI in the Olympics

This Week in Machine Learning & AI brings you the week’s most interesting and important stories from the world of machine learning and artificial intelligence. This week we discuss Intel’s latest deep...

15 Aug 2016, 23min

This Week in ML & AI – 8/5/16: Apple Acquires Turi, the DARPA Hacker-Bot Challenge and More

This Week in Machine Learning & AI brings you the week’s most interesting and important stories from the world of machine learning and artificial intelligence. This week we look at Apple’s acquisition...

6 Aug 2016, 24min

Clare Corthell - Open Source Data Science Masters, Hybrid AI, Algorithmic Ethics - TWiML Talk #1

This Week in Machine Learning & AI brings you the week’s most interesting and important stories from the world of machine learning and artificial intelligence. We try something new this week with an i...

31 Jul 2016, 47min
