Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (778)

Data Innovation & AI at Capital One with Adam Wenchel - TWiML Talk #147

In this episode I’m joined by Adam Wenchel, vice president of AI and Data Innovation at Capital One, to discuss how Machine Learning & AI are being integrated into their day-to-day practices, and how ...

4 June 2018, 45 min

Deep Gradient Compression for Distributed Training with Song Han - TWiML Talk #146

On today’s show I chat with Song Han, assistant professor in MIT’s EECS department, about his research on Deep Gradient Compression. In our conversation, we explore the challenge of distributed traini...

31 May 2018, 46 min

Masked Autoregressive Flow for Density Estimation with George Papamakarios - TWiML Talk #145

In this episode, University of Edinburgh PhD student George Papamakarios and I discuss his paper “Masked Autoregressive Flow for Density Estimation.” George walks us through the idea of Masked Autoreg...

28 May 2018, 34 min

Training Data for Computer Vision at Figure Eight with Qazaleh Mirsharif - TWiML Talk #144

For today’s show, the last in our TrainAI series, I'm joined by Qazaleh Mirsharif, a machine learning scientist working on computer vision at Figure Eight. Qazaleh and I caught up at the TrainAI confe...

25 May 2018, 21 min

Agile Data Science with Sarah Aerni - TWiML Talk #143

Today we continue our TrainAI series with Sarah Aerni, Director of Data Science at Salesforce Einstein. Sarah and I sat down at the TrainAI conference to discuss her talk “Notes from the Field: The Pl...

24 May 2018, 38 min

Tensor Operations for Machine Learning with Anima Anandkumar - TWiML Talk #142

In this episode of our TrainAI series, I sit down with Anima Anandkumar, Bren Professor at Caltech and Principal Scientist with Amazon Web Services. Anima joined me to discuss the research coming out ...

23 May 2018, 34 min

Deep Learning for Live-Cell Imaging with David Van Valen - TWiML Talk #141

In today’s show, I sit down with David Van Valen, assistant professor of Bioengineering & Biology at Caltech. David joined me after his talk at the Figure Eight TrainAI conference to chat about his re...

22 May 2018, 37 min

Checking in with the Master w/ Garry Kasparov - TWiML Talk #140

In this episode I’m joined by legendary chess champion, author, and fellow at the Oxford Martin School, Garry Kasparov. Garry and I sat down after his keynote at the Figure Eight Train AI conference i...

21 May 2018, 32 min
