Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (779)

Intel Nervana Update + Productizing AI Research with Naveen Rao And Hanlin Tang - TWiML Talk #31

I talked about Intel’s acquisition of Nervana Systems on the podcast when it happened almost a year ago, so I was super excited to have an opportunity to sit down with Nervana co-founder Naveen Rao, w...

5 July 2017, 38 min

Expressive AI - Generated Music With Google's Performance RNN - Doug Eck - TWiML Talk #32

My guest for this second show in our O’Reilly AI series is Doug Eck of Google Brain. Doug did a keynote at the O’Reilly conference on Magenta, Google’s project for melding machine learning and the art...

5 July 2017, 46 min

The Power Of Probabilistic Programming with Ben Vigoda - TWiML Talk #33

My guest for this third episode in the O'Reilly AI series is Ben Vigoda. Ben is the founder and CEO of Gamalon, a DARPA-funded startup working on Bayesian Program Synthesis. We dive into what exactly ...

5 July 2017, 42 min

Video Object Detection At Scale with Reza Zadeh - TWiML Talk #34

My guest for the fourth show in the O'Reilly AI Series is Reza Zadeh. Reza is an adjunct professor of computational mathematics at Stanford University and founder and CEO of the startup Matroid. Reza ...

5 July 2017, 52 min

Enhancing Customer Experiences With Emotional AI with Rana El Kaliouby - TWiML Talk #35

My guest for this show is Rana el Kaliouby. Rana is co-founder and CEO of Affectiva. Affectiva, as Rana puts it, "is on a mission to humanize technology by bringing in artificial emotional intelligenc...

5 July 2017, 33 min

Natural Language Understanding for Amazon Alexa with Zornitsa Kozareva - TWiML Talk #30

Our guest this week is Zornitsa Kozareva, Manager of Machine Learning with Amazon Web Services Deep Learning, where she leads a group focused on natural language processing and dialogue systems for pr...

29 June 2017, 55 min

Robotic Perception and Control with Chelsea Finn - TWiML Talk #29

This week we continue our series on industrial applications of machine learning and AI with a conversation with Chelsea Finn, a PhD student at UC Berkeley. Chelsea’s research is focused on machine lea...

23 June 2017, 54 min

Reinforcement Learning Deep Dive with Pieter Abbeel - TWiML Talk #28

This week our guest is Pieter Abbeel, Assistant Professor at UC Berkeley, Research Scientist at OpenAI, and Cofounder of Gradescope. Pieter has an extensive background in AI research, going way back t...

17 June 2017, 52 min
