Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We dig into the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines versus end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models which incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Jaksot(778)

Machine Learning Platforms at Uber with Mike Del Balso - TWiML Talk #115

Machine Learning Platforms at Uber with Mike Del Balso - TWiML Talk #115

In this episode, I speak with Mike Del Balso, Product Manager for Machine Learning Platforms at Uber. Mike and I sat down last fall at the Georgian Partners Portfolio conference to discuss his present...

1 Maalis 201849min

Inverse Programming for Deeper AI with Zenna Tavares - TWiML Talk #114

Inverse Programming for Deeper AI with Zenna Tavares - TWiML Talk #114

For today’s show, the final episode of our Black in AI Series, I’m joined by Zenna Tavares, a PhD student in the both the department of Brain and Cognitive Sciences and the Computer Science and Artifi...

26 Helmi 201828min

Statistical Relational Artificial Intelligence with Sriraam Natarajan - TWiML Talk #113

Statistical Relational Artificial Intelligence with Sriraam Natarajan - TWiML Talk #113

In this episode, I speak with Sriraam Natarajan, Associate Professor in the Department of Computer Science at UT Dallas. While at NIPS a few months back, Sriraam and I sat down to discuss his work on ...

23 Helmi 201847min

Classical Machine Learning for Infant Medical Diagnosis with Charles Onu - TWiML Talk #112

Classical Machine Learning for Infant Medical Diagnosis with Charles Onu - TWiML Talk #112

In this episode, part 4 in our Black in AI series, i'm joined by Charles Onu, Phd Student at McGill University in Montreal & Founder of Ubenwa, a startup tackling the problem of infant mortality due t...

20 Helmi 201848min

Learning "Common Sense" and Physical Concepts with Roland Memisevic - TWiML Talk #111

Learning "Common Sense" and Physical Concepts with Roland Memisevic - TWiML Talk #111

In today’s episode, I’m joined by Roland Memisevic, co-founder, CEO, and chief scientist at Twenty Billion Neurons. Roland joined me at the RE•WORK Deep Learning Summit in Montreal to discuss the work...

15 Helmi 201832min

Trust in Human-Robot/AI Interactions with Ayanna Howard - TWiML Talk #110

Trust in Human-Robot/AI Interactions with Ayanna Howard - TWiML Talk #110

In this episode, the third in our Black in AI series, I speak with Ayanna Howard, Chair of the Interactive School of Computing at Georgia Tech. Ayanna joined me for a lively discussion about her work ...

13 Helmi 201846min

Data Science for Poaching Prevention and Disease Treatment with Nyalleng Moorosi - TWiML Talk #109

Data Science for Poaching Prevention and Disease Treatment with Nyalleng Moorosi - TWiML Talk #109

For today’s show, I'm joined by Nyalleng Moorosi, Senior Data Science Researcher at The Council for Scientific & Industrial Research or CSIR, in Pretoria, South Africa. In our discussion, we discuss t...

8 Helmi 201852min

Security and Safety in AI: Adversarial Examples, Bias and Trust w/ Moustapha Cissé - TWiML Talk #108

Security and Safety in AI: Adversarial Examples, Bias and Trust w/ Moustapha Cissé - TWiML Talk #108

In this episode I’m joined by Moustapha Cissé, Research Scientist at Facebook AI Research Lab (or FAIR) Paris. Moustapha’s broad research interests include the security and safety of AI systems, and w...

6 Helmi 201850min

Suosittua kategoriassa Politiikka ja uutiset

aikalisa
tervo-halme
rss-ootsa-kuullut-tasta
ootsa-kuullut-tasta-2
politiikan-puskaradio
viisupodi
rss-vaalirankkurit-podcast
otetaan-yhdet
et-sa-noin-voi-sanoo-esittaa
rss-podme-livebox
io-techin-tekniikkapodcast
linda-maria
rikosmyytit
rss-tasta-on-kyse-ivan-puopolo-verkkouutiset
rss-asiastudio
the-ulkopolitist
rss-uusi-juttu
rss-fi-lainsaadanto-paremmaksi
rss-hyvaa-huomenta-bryssel
rss-50100-podcast