Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multimodal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (778)

Philosophy of Intelligence with Matthew Crosby - TWiML Talk #91

This week on the podcast we’re featuring a series of conversations from the NIPS conference in Long Beach, California. I attended a bunch of talks and learned a ton, organized an impromptu roundtable ...

21 Dec 2017, 29 min

Geometric Deep Learning with Joan Bruna & Michael Bronstein - TWiML Talk #90

This week on the podcast we’re featuring a series of conversations from the NIPS conference in Long Beach, California. I attended a bunch of talks and learned a ton, organized an impromptu roundtable ...

20 Dec 2017, 40 min

AI at the NASA Frontier Development Lab with Sara Jennings, Timothy Seabrook and Andres Rodriguez

This week on the podcast we’re featuring a series of conversations from the NIPS conference in Long Beach, California. I attended a bunch of talks and learned a ton, organized an impromptu roundtable ...

19 Dec 2017, 36 min

Using Deep Learning and Google Street View to Estimate Demographics with Timnit Gebru

This week on the podcast we’re featuring a series of conversations from the NIPS conference in Long Beach, California. I attended a bunch of talks and learned a ton, organized an impromptu roundtable ...

19 Dec 2017, 32 min

Integrative Learning for Robotic Systems with Aaron Ames - TWiML Talk #87

This week on the podcast we’re featuring a series of conversations from the AWS re:Invent conference in Las Vegas. I had a great time at this event getting caught up on the latest and greatest machine...

15 Dec 2017, 47 min

Visual Recognition in the Cloud for Law Enforcement with Chris Adzima - TWiML Talk #86

This week on the podcast we’re featuring a series of conversations from the AWS re:Invent conference in Las Vegas. I had a great time at this event getting caught up on the latest and greatest machine...

14 Dec 2017, 35 min

Embodied Visual Learning with Kristen Grauman - TWiML Talk #85

This week on the podcast we’re featuring a series of conversations from the AWS re:Invent conference in Las Vegas. I had a great time at this event getting caught up on the latest and greatest machine...

13 Dec 2017, 39 min

Real-Time Machine Learning in the Database with Nikita Shamgunov - TWiML Talk #84

This week on the podcast we’re featuring a series of conversations from the AWS re:Invent conference in Las Vegas. I had a great time at this event getting caught up on the latest and greatest machine...

12 Dec 2017, 39 min
