Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (781)

Robotic Perception and Control with Chelsea Finn - TWiML Talk #29

This week we continue our series on industrial applications of machine learning and AI with a conversation with Chelsea Finn, a PhD student at UC Berkeley. Chelsea’s research is focused on machine lea...

23 June 2017, 54 min

Reinforcement Learning Deep Dive with Pieter Abbeel - TWiML Talk #28

This week our guest is Pieter Abbeel, Assistant Professor at UC Berkeley, Research Scientist at OpenAI, and Cofounder of Gradescope. Pieter has an extensive background in AI research, going way back t...

17 June 2017, 52 min

Intelligent Autonomous Robots with Ilia Baranov - TWiML Talk #27

Our first guest in the Industrial AI series is Ilia Baranov, engineering manager at Clearpath Robotics. Ilia is responsible for setting the engineering direction for all of Clearpath’s research platfo...

9 June 2017, 53 min

Global AI Trends with Ben Lorica - TWiML Talk #26

This week I’ve invited my friend Ben Lorica onto the show. Ben is Chief Data Scientist for O’Reilly Media, and Program Director of Strata Data & the O'Reilly A.I. conference. Ben has worked on analyti...

2 June 2017, 54 min

Offensive vs Defensive Data Science with Deep Varma - TWiML Talk #25

This week on the show my guest is Deep Varma, Vice President of Data Engineering at real estate startup Trulia. Deep has run data engineering teams in Silicon Valley for well over a decade, and is now...

26 May 2017, 53 min

Reinforcement Learning: The Next Frontier of Gaming with Danny Lange - TWiML Talk #24

My guest on the show this week is Danny Lange, VP for Machine Learning & AI at video game technology developer Unity Technologies. Danny is well traveled in the world of ML and AI, and has had a hand ...

20 May 2017, 54 min

Integrating Psycholinguistics into AI with Dominique Simmons - TWiML Talk #23

I think you’re really going to enjoy today’s show. Our guest this week is Dominique Simmons, Applied Research Scientist at AI tools vendor Dimensional Mechanics. Dominique brings an interesting backgr...

12 May 2017, 1 h

Deep Neural Nets for Visual Recognition with Matt Zeiler - TWiML Talk #22

Today we bring you our final interview from backstage at the NYU FutureLabs AI Summit. Our guest this week is Matt Zeiler. Matt graduated from the University of Toronto where he worked with deep learn...

5 May 2017, 22 min
