Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (781)

ML Use Cases at Think Big Analytics with Mo Patel and Laura Frølich - TWiML Talk #54

The show you’re about to hear is part of a series of shows recorded in San Francisco at the Artificial Intelligence Conference. This time around, I speak with Mo Patel, practice director of AI & deep ...

6 Oct 2017, 45 min

Intel Nervana Devcloud with Naveen Rao & Scott Apeland - TWiML Talk #51

In this episode, I talk to Naveen Rao, VP and GM of Intel’s AI Products Group, and Scott Apeland, director of Intel’s Developer Network. It's been a few months since we last spoke to Naveen, so he giv...

6 Oct 2017, 37 min

Ray: A Distributed Computing Platform for Reinforcement Learning with Ion Stoica - TWiML Talk #55

The show you’re about to hear is part of a series of shows recorded in San Francisco at the Artificial Intelligence Conference. In this episode, I talk with Ion Stoica, professor of computer science &...

5 Oct 2017, 28 min

Topological Data Analysis with Gunnar Carlsson - TWiML Talk #53

The show you’re about to hear is part of a series of shows recorded in San Francisco at the Artificial Intelligence Conference. My guest for this show is Gunnar Carlsson, professor emeritus of mathema...

3 Oct 2017, 33 min

Bayesian Optimization for Hyperparameter Tuning with Scott Clark - TWiML Talk #50

As you all know, a few weeks ago, I spent some time in SF at the Artificial Intelligence Conference. While I was there, I had just enough time to sneak away and catch up with Scott Clark, Co-Founder a...

2 Oct 2017, 47 min

Symbolic and Sub-Symbolic Natural Language Processing with Jonathan Mugan - TWiML Talk #49

Like last week’s interview with Bruno Gonçalves, this week’s interview was also recorded at the last O’Reilly AI Conference back in New York in June. Also like last week’s show, this week’s is also fo...

25 Sep 2017, 43 min

Word2Vec & Friends with Bruno Gonçalves - TWiML Talk #48

This week I'm bringing you an interview with Bruno Gonçalves, a Moore-Sloan Data Science Fellow at NYU. As you’ll hear in the interview, Bruno is a longtime listener of the podcast. We were able to co...

19 Sep 2017, 32 min

Evolutionary Algorithms in Machine Learning with Risto Miikkulainen - TWiML Talk #47

My guest this week is Risto Miikkulainen, professor of computer science at UT-Austin and vice president of Research at Sentient Technologies. Risto came locked and loaded to discuss a topic that we've...

11 Sep 2017, 58 min
