Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of both attention and state, the significance of state update mechanisms for model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (778)

re:Invent Roundup Roundtable - TWiML Talk #83

This week on the podcast we’re featuring a series of conversations from the AWS re:Invent conference in Las Vegas. I had a great time at this event getting caught up on the latest and greatest machine...

11 Dec 2017, 1h 6min

Driving Customer Loyalty with Predictive and Conversational AI with Sherif Mityas - TWiML Talk #82

This week on the podcast we’re running a series of shows consisting of conversations with some of the impressive speakers from an event called the AI Summit in New York City. The theme of the conferen...

8 Dec 2017, 36min

Innovation Factories for AI in Financial Services with Thierry Derungs - TWiML Talk #81

This week on the podcast we’re running a series of shows consisting of conversations with some of the impressive speakers from an event called the AI Summit in New York City. The theme of the conferen...

7 Dec 2017, 40min

Block-Sparse Kernels for Deep Neural Networks with Durk Kingma - TWiML Talk #80

The show is part of a series that I’m really excited about, in part because I’ve been working to bring them to you for quite a while now. The focus of the series is a sampling of the interesting work ...

7 Dec 2017, 44min

AI for Customer Service and Marketing at Aeromexico with Brian Gross - TWiML Talk #79

This week on the podcast we’re running a series of shows consisting of conversations with some of the impressive speakers from an event called the AI Summit in New York City. The theme of the conferen...

6 Dec 2017, 29min

Scaling AI for the Enterprise with Mazin Gilbert - TWiML Talk #78

This week on the podcast we’re running a series of shows consisting of conversations with some of the impressive speakers from an event called the AI Summit in New York City. The theme of the conferen...

5 Dec 2017, 49min

Scalable Distributed Deep Learning with Hillery Hunter - TWiML Talk #77

This week on the podcast we’re running a series of shows consisting of conversations with some of the impressive speakers from an event called the AI Summit in New York City. The theme of the conferen...

4 Dec 2017, 38min

Robotics at OpenAI with Jonas Schneider - TWiML Talk #76

The show is part of a series that I’m really excited about, in part because I’ve been working to bring them to you for quite a while now. The focus of the series is a sampling of the interesting work ...

1 Dec 2017, 45min
