Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (779)

Geometric Statistics in Machine Learning w/ geomstats with Nina Miolane - TWiML Talk #196

In this episode we’re joined by Nina Miolane, researcher and lecturer at Stanford University. Nina and I spoke about her work in the field of geometric statistics in ML, specifically the application o...

1 Nov 2018, 43 min

Milestones in Neural Natural Language Processing with Sebastian Ruder - TWiML Talk #195

In this episode, we’re joined by Sebastian Ruder, PhD student studying NLP at National University of Ireland and Research Scientist at text analysis startup Aylien. We discuss recent milestones in neu...

29 Oct 2018, 1 h 1 min

Natural Language Processing at StockTwits with Garrett Hoffman - TWiML Talk #194

In this episode, we’re joined by Garrett Hoffman, Director of Data Science at Stocktwits. Stocktwits is a social network for the investing community which has its roots in the use of the $cashtag on T...

25 Oct 2018, 50 min

Advanced Reinforcement Learning & Data Science for Social Impact with Vukosi Marivate - TWiML Talk #193

In the final episode of our Deep Learning Indaba series, we speak with Vukosi Marivate, Chair of Data Science at the University of Pretoria and a co-organizer of the Indaba. My conversation with Vuko...

23 Oct 2018, 46 min

AI Ethics, Strategic Decisioning and Game Theory with Osonde Osoba - TWiML Talk #192

In this episode of our Deep Learning Indaba Series, we’re joined by Osonde Osoba, Engineer at RAND Corporation. Osonde and I spoke on the heels of the Indaba, where he presented on AI Ethics and Poli...

18 Oct 2018, 47 min

Acoustic Word Embeddings for Low Resource Speech Processing with Herman Kamper - TWiML Talk #191

In this episode of our Deep Learning Indaba Series, we’re joined by Herman Kamper, lecturer at Stellenbosch University in South Africa and a co-organizer of the Indaba. We discuss his work on limited- and zero...

16 Oct 2018, 1 h 1 min

Learning Representations for Visual Search with Naila Murray - TWiML Talk #190

In this episode of our Deep Learning Indaba series, we’re joined by Naila Murray, Senior Research Scientist and Group Lead in the computer vision group at Naver Labs Europe. Naila presented at the In...

12 Oct 2018, 41 min

Evaluating Model Explainability Methods with Sara Hooker - TWiML Talk #189

In this, the first episode of the Deep Learning Indaba series, we’re joined by Sara Hooker, AI Resident at Google Brain. I spoke with Sara in the run-up to the Indaba about her work on interpretabilit...

10 Oct 2018, 1 h 3 min
