Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (778)

Exploring AI-Generated Music with Taryn Southern - TWiML Talk #139

In this episode, I’m joined by Taryn Southern, a singer, digital storyteller and YouTuber, whose upcoming album I AM AI will be produced completely with AI-based tools. Taryn and I explore all aspects...

17 May 2018, 33 min

Practical Deep Learning with Rachel Thomas - TWiML Talk #138

In this episode, I'm joined by Rachel Thomas, founder and researcher at Fast AI. If you’re not familiar with Fast AI, the company offers a series of courses including Practical Deep Learning for Coder...

14 May 2018, 44 min

Kinds of Intelligence w/ Jose Hernandez-Orallo - TWiML Talk #137

In this episode, I'm joined by Jose Hernandez-Orallo, professor in the department of information systems and computing at Universitat Politècnica de València and fellow at the Leverhulme Centre for th...

10 May 2018, 44 min

Taming arXiv with Natural Language Processing w/ John Bohannon - TWiML Talk #136

In this episode, I'm joined by John Bohannon, Director of Science at AI startup Primer. As you all may know, a few weeks ago we released my interview with Google legend Jeff Dean, which, by the way, yo...

7 May 2018, 54 min

Epsilon Software for Private Machine Learning with Chang Liu - TWiML Talk #135

In this episode, our final episode in the Differential Privacy series, I speak with Chang Liu, applied research scientist at Georgian Partners, a venture capital firm that invests in growth stage busi...

4 May 2018, 46 min

Scalable Differential Privacy for Deep Learning with Nicolas Papernot - TWiML Talk #134

In this episode of our Differential Privacy series, I'm joined by Nicolas Papernot, Google PhD Fellow in Security and graduate student in the department of computer science at Penn State University. N...

3 May 2018, 59 min

Differential Privacy at Bluecore with Zahi Karam - TWiML Talk #133

In this episode of our Differential Privacy series, I'm joined by Zahi Karam, Director of Data Science at Bluecore, whose retail marketing platform specializes in personalized email marketing. I sat d...

1 May 2018, 38 min

Differential Privacy Theory & Practice with Aaron Roth - TWiML Talk #132

In the first episode of our Differential Privacy series, I'm joined by Aaron Roth, associate professor of computer science and information science at the University of Pennsylvania. Aaron is first and...

30 April 2018, 42 min
