Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We dig into the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines versus end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models which incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Jaksot(778)

Are Vector DBs the Future Data Platform for AI? with Ed Anuff - #664

Are Vector DBs the Future Data Platform for AI? with Ed Anuff - #664

Today we’re joined by Ed Anuff, chief product officer at DataStax. In our conversation, we discuss Ed’s insights on RAG, vector databases, embedding models, and more. We dig into the underpinnings of ...

28 Joulu 202348min

Quantizing Transformers by Helping Attention Heads Do Nothing with Markus Nagel - #663

Quantizing Transformers by Helping Attention Heads Do Nothing with Markus Nagel - #663

Today we’re joined by Markus Nagel, research scientist at Qualcomm AI Research, who helps us kick off our coverage of NeurIPS 2023. In our conversation with Markus, we cover his accepted papers at the...

26 Joulu 202346min

Responsible AI in the Generative Era with Michael Kearns - #662

Responsible AI in the Generative Era with Michael Kearns - #662

Today we’re joined by Michael Kearns, professor in the Department of Computer and Information Science at the University of Pennsylvania and an Amazon scholar. In our conversation with Michael, we disc...

22 Joulu 202336min

Edutainment for AI and AWS PartyRock with Mike Miller - #661

Edutainment for AI and AWS PartyRock with Mike Miller - #661

Today we’re joined by Mike Miller, director of product at AWS responsible for the company’s “edutainment” products. In our conversation with Mike, we explore AWS PartyRock, a no-code generative AI app...

18 Joulu 202329min

Data, Systems and ML for Visual Understanding with Cody Coleman - #660

Data, Systems and ML for Visual Understanding with Cody Coleman - #660

Today we’re joined by Cody Coleman, co-founder and CEO of Coactive AI. In our conversation with Cody, we discuss how Coactive has leveraged modern data, systems, and machine learning techniques to del...

14 Joulu 202338min

Patterns and Middleware for LLM Applications with Kyle Roche - #659

Patterns and Middleware for LLM Applications with Kyle Roche - #659

Today we’re joined by Kyle Roche, founder and CEO of Griptape to discuss patterns and middleware for LLM applications. We dive into the emerging patterns for developing LLM applications, such as off p...

11 Joulu 202335min

AI Access and Inclusivity as a Technical Challenge with Prem Natarajan - #658

AI Access and Inclusivity as a Technical Challenge with Prem Natarajan - #658

Today we’re joined by Prem Natarajan, chief scientist and head of enterprise AI at Capital One. In our conversation, we discuss AI access and inclusivity as technical challenges and explore some of Pr...

4 Joulu 202341min

Building LLM-Based Applications with Azure OpenAI with Jay Emery - #657

Building LLM-Based Applications with Azure OpenAI with Jay Emery - #657

Today we’re joined by Jay Emery, director of technical sales & architecture at Microsoft Azure. In our conversation with Jay, we discuss the challenges faced by organizations when building LLM-based a...

28 Marras 202343min

Suosittua kategoriassa Politiikka ja uutiset

aikalisa
tervo-halme
rss-ootsa-kuullut-tasta
ootsa-kuullut-tasta-2
politiikan-puskaradio
viisupodi
rss-vaalirankkurit-podcast
rss-podme-livebox
et-sa-noin-voi-sanoo-esittaa
otetaan-yhdet
linda-maria
io-techin-tekniikkapodcast
rss-tasta-on-kyse-ivan-puopolo-verkkouutiset
rikosmyytit
rss-polikulaari-humanisti-vastaa-ja-muut-ts-podcastit
viela-yksi-sivu
rss-uusi-juttu
rss-aika-ankkuri
rss-kaikki-uusiksi
rss-merja-mahkan-rahat