Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We examine the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We dig into the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (779)

Reinforcement Learning for Personalization at Spotify with Tony Jebara - #609

Today we continue our NeurIPS 2022 series joined by Tony Jebara, VP of engineering and head of machine learning at Spotify. In our conversation with Tony, we discuss his role at Spotify and how the co...

29 Dec 2022, 41 min

Will ChatGPT take my job? - #608

More than any system before it, ChatGPT has tapped into our enduring fascination with artificial intelligence, raising in a more concrete and present way important questions and fears about what AI is...

26 Dec 2022, 37 min

Geospatial Machine Learning at AWS with Kumar Chellapilla - #607

Today we continue our re:Invent 2022 series joined by Kumar Chellapilla, a general manager of ML and AI Services at AWS. We had the opportunity to speak with Kumar after announcing their recent additi...

22 Dec 2022, 36 min

Real-Time ML Workflows at Capital One with Disha Singla - #606

Today we’re joined by Disha Singla, a senior director of machine learning engineering at Capital One. In our conversation with Disha, we explore her role as the leader of the Data Insights team at Cap...

19 Dec 2022, 43 min

Weakly Supervised Causal Representation Learning with Johann Brehmer - #605

Today we’re excited to kick off our coverage of the 2022 NeurIPS conference with Johann Brehmer, a research scientist at Qualcomm AI Research in Amsterdam. We begin our conversation discussing some of...

15 Dec 2022, 46 min

Stable Diffusion & Generative AI with Emad Mostaque - #604

Today we’re excited to kick off our 2022 AWS re:Invent series with a conversation with Emad Mostaque, Founder and CEO of Stability.ai. Stability.ai is a very popular name in the generative AI space at...

12 Dec 2022, 42 min

Exploring Large Language Models with ChatGPT - #603

Today we're joined by ChatGPT, the latest and coolest large language model developed by OpenAI. In our conversation with ChatGPT, we discuss the background and capabilities of large language models, t...

8 Dec 2022, 36 min

Accelerating Intelligence with AI-Generating Algorithms with Jeff Clune - #602

Are AI-generating algorithms the path to artificial general intelligence (AGI)? Today we’re joined by Jeff Clune, an associate professor of computer science at the University of British Columbia, and...

5 Dec 2022, 56 min
