Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (779)

Taskonomy: Disentangling Transfer Learning for Perception (CVPR 2018 Best Paper Winner) with Amir Zamir - TWiML Talk #164

In this episode I'm joined by Amir Zamir, Postdoctoral researcher at both Stanford & UC Berkeley, who joins us fresh off of winning the 2018 CVPR Best Paper Award for co-authoring "Taskonomy: Disentan...

16 July 2018, 47 min

Predicting Metabolic Pathway Dynamics w/ Machine Learning with Zak Costello - TWiML Talk #163

In today’s episode I’m joined by Zak Costello, post-doctoral fellow at the Joint BioEnergy Institute to discuss his recent paper, “A machine learning approach to predict metabolic pathway dynamics fro...

11 July 2018, 39 min

Machine Learning to Discover Physics and Engineering Principles with Nathan Kutz - TWiML Talk #162

In this episode, I’m joined by Nathan Kutz, Professor of applied mathematics, electrical engineering and physics at the University of Washington to discuss his research into the use of machine learnin...

9 July 2018, 43 min

Automating Complex Internal Processes w/ AI with Alexander Chukovski - TWiML Talk #161

In this episode, I'm joined by Alexander Chukovski, Director of Data Services at Experteer, a career platform based in Munich, Germany. In our conversation, we explore Alex’s journey to implement machine l...

5 July 2018, 39 min

Designing Better Sequence Models with RNNs with Adji Bousso Dieng - TWiML Talk #160

In this episode, I'm joined by Adji Bousso Dieng, PhD Student in the Department of Statistics at Columbia University to discuss two of her recent papers, “Noisin: Unbiased Regularization for Recurrent...

2 July 2018, 38 min

Love Love: AI and ML in Tennis with Stephanie Kovalchik - TWiML Talk #159

In the final show in our AI in Sports series, I’m joined by Stephanie Kovalchik, Research Fellow at Victoria University and Senior Sports Scientist at Tennis Australia. In our conversation we discuss...

29 June 2018, 46 min

Growth Hacking Sports w/ Machine Learning with Noah Gift - TWiML Talk #158

In this episode of our AI in Sports series I'm joined by Noah Gift, Founder and Consulting CTO at Pragmatic Labs and professor at UC Davis. Noah and I discuss some of his recent work in using social m...

28 June 2018, 50 min

Fine-Grained Player Prediction in Sports with Jennifer Hobbs - TWiML Talk #157

In this episode of our AI in Sports series, I'm joined by Jennifer Hobbs, Senior Data Scientist at STATS, a collector and distributor of sports data, to discuss the STATS data pipeline and how they co...

27 June 2018, 42 min
