Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We dig into the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines versus end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models which incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Jaksot(778)

Predicting Metabolic Pathway Dynamics w/ Machine Learning with Zak Costello - TWiML Talk #163

Predicting Metabolic Pathway Dynamics w/ Machine Learning with Zak Costello - TWiML Talk #163

In today’s episode I’m joined by Zak Costello, post-doctoral fellow at the Joint BioEnergy Institute to discuss his recent paper, “A machine learning approach to predict metabolic pathway dynamics fro...

11 Heinä 201839min

Machine Learning to Discover Physics and Engineering Principles with Nathan Kutz - TWiML Talk #162

Machine Learning to Discover Physics and Engineering Principles with Nathan Kutz - TWiML Talk #162

In this episode, I’m joined by Nathan Kutz, Professor of applied mathematics, electrical engineering and physics at the University of Washington to discuss his research into the use of machine learnin...

9 Heinä 201843min

Automating Complex Internal Processes w/ AI with Alexander Chukovski - TWiML Talk #161

Automating Complex Internal Processes w/ AI with Alexander Chukovski - TWiML Talk #161

In this episode, I'm joined by Alexander Chukovski, Director of Data Services at Munich, Germany based career platform, Experteer. In our conversation, we explore Alex’s journey to implement machine l...

5 Heinä 201839min

Designing Better Sequence Models with RNNs with Adji Bousso Dieng - TWiML Talk #160

Designing Better Sequence Models with RNNs with Adji Bousso Dieng - TWiML Talk #160

In this episode, I'm joined by Adji Bousso Dieng, PhD Student in the Department of Statistics at Columbia University to discuss two of her recent papers, “Noisin: Unbiased Regularization for Recurrent...

2 Heinä 201838min

Love Love: AI and ML in Tennis with Stephanie Kovalchik - TWiML Talk #159

Love Love: AI and ML in Tennis with Stephanie Kovalchik - TWiML Talk #159

In the final show in our AI in Sports series, I’m joined by Stephanie Kovalchik, Research Fellow at Victoria University and Senior Sports Scientist at Tennis Australia. In our conversation we discuss...

29 Kesä 201846min

Growth Hacking Sports w/ Machine Learning with Noah Gift - TWiML Talk #158

Growth Hacking Sports w/ Machine Learning with Noah Gift - TWiML Talk #158

In this episode of our AI in Sports series I'm joined by Noah Gift, Founder and Consulting CTO at Pragmatic Labs and professor at UC Davis. Noah and I discuss some of his recent work in using social m...

28 Kesä 201850min

Fine-Grained Player Prediction in Sports with Jennifer Hobbs - TWiML Talk #157

Fine-Grained Player Prediction in Sports with Jennifer Hobbs - TWiML Talk #157

In this episode of our AI in Sports series, I'm joined by Jennifer Hobbs, Senior Data Scientist at STATS, a collector and distributor of sports data, to discuss the STATS data pipeline and how they co...

27 Kesä 201842min

Targeted Ticket Sales Using Azure ML with the Trail Blazers w/ Mike Schumacher & Chenhui Hu - TWiML Talk #156

Targeted Ticket Sales Using Azure ML with the Trail Blazers w/ Mike Schumacher & Chenhui Hu - TWiML Talk #156

In today’s episode of our AI in Sports series I'm joined by Mike Schumacher, director of business analytics for the Portland Trail Blazers, and Chenhui Hu, a data scientist at Microsoft to discuss how...

26 Kesä 201837min

Suosittua kategoriassa Politiikka ja uutiset

aikalisa
tervo-halme
rss-ootsa-kuullut-tasta
ootsa-kuullut-tasta-2
politiikan-puskaradio
viisupodi
rss-vaalirankkurit-podcast
rss-podme-livebox
et-sa-noin-voi-sanoo-esittaa
otetaan-yhdet
linda-maria
io-techin-tekniikkapodcast
rss-tasta-on-kyse-ivan-puopolo-verkkouutiset
rikosmyytit
rss-polikulaari-humanisti-vastaa-ja-muut-ts-podcastit
viela-yksi-sivu
rss-uusi-juttu
rss-aika-ankkuri
rss-kaikki-uusiksi
rss-merja-mahkan-rahat