Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (778)

Philosophy of Intelligence with Matthew Crosby - TWiML Talk #91

This week on the podcast we’re featuring a series of conversations from the NIPS conference in Long Beach, California. I attended a bunch of talks and learned a ton, organized an impromptu roundtable ...

21 Dec 2017, 29 min

Geometric Deep Learning with Joan Bruna & Michael Bronstein - TWiML Talk #90

This week on the podcast we’re featuring a series of conversations from the NIPS conference in Long Beach, California. I attended a bunch of talks and learned a ton, organized an impromptu roundtable ...

20 Dec 2017, 40 min

AI at the NASA Frontier Development Lab with Sara Jennings, Timothy Seabrook and Andres Rodriguez

This week on the podcast we’re featuring a series of conversations from the NIPS conference in Long Beach, California. I attended a bunch of talks and learned a ton, organized an impromptu roundtable ...

19 Dec 2017, 36 min

Using Deep Learning and Google Street View to Estimate Demographics with Timnit Gebru

This week on the podcast we’re featuring a series of conversations from the NIPS conference in Long Beach, California. I attended a bunch of talks and learned a ton, organized an impromptu roundtable ...

19 Dec 2017, 32 min

Integrative Learning for Robotic Systems with Aaron Ames - TWiML Talk #87

This week on the podcast we’re featuring a series of conversations from the AWS re:Invent conference in Las Vegas. I had a great time at this event getting caught up on the latest and greatest machine...

15 Dec 2017, 47 min

Visual Recognition in the Cloud for Law Enforcement with Chris Adzima - TWiML Talk #86

This week on the podcast we’re featuring a series of conversations from the AWS re:Invent conference in Las Vegas. I had a great time at this event getting caught up on the latest and greatest machine...

14 Dec 2017, 35 min

Embodied Visual Learning with Kristen Grauman - TWiML Talk #85

This week on the podcast we’re featuring a series of conversations from the AWS re:Invent conference in Las Vegas. I had a great time at this event getting caught up on the latest and greatest machine...

13 Dec 2017, 39 min

Real-Time Machine Learning in the Database with Nikita Shamgunov - TWiML Talk #84

This week on the podcast we’re featuring a series of conversations from the AWS re:Invent conference in Las Vegas. I had a great time at this event getting caught up on the latest and greatest machine...

12 Dec 2017, 39 min
