Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (778)

Peering into the Home w/ Aerial.ai's Wifi Motion Analytics - TWiML Talk #107

In this episode I’m joined by Michel Allegue and Negar Ghourchian of Aerial.ai. Aerial is doing some really interesting things in the home automation space, by using wifi signal statistics to identify...

2 Feb 2018, 40 min

Physiology-Based Models for Fitness and Training w/ Firstbeat with Ilkka Korhonen - TWiML Talk #106

In this episode I'm joined by Ilkka Korhonen, Vice President of Technology at Firstbeat, a company whose algorithms are embedded in fitness watches from companies like Garmin and Suunto and which use ...

2 Feb 2018, 35 min

Machine Learning for Signal Processing Applications w/ Stuart Feffer & Brady Tsai - TWiML Talk #105

In this episode, I'm joined by Stuart Feffer, co-founder and CEO of Reality AI, which provides tools and services for engineers working with sensors and signals, and Brady Tsai, Business Development M...

1 Feb 2018, 36 min

Personalizing the Ferrari Challenge Experience w/ Intel AI - TWiML Talk #104

In this episode, I'm joined by Andy Keller and Emile Chin-Dickey to discuss Intel's partnership with the Ferrari Challenge North American Series. Andy is a Deep Learning Data Scientist at Intel and Em...

31 Jan 2018, 37 min

Deep Learning for 3D Sensors and Cameras in Lighthouse with Alex Teichman - TWiML Talk #103

In this episode, I sit down with Alex Teichman, CEO and Co-Founder of Lighthouse, a company taking a new approach to the in-home smart camera. Alex and I dig into what exactly the Lighthouse product i...

30 Jan 2018, 42 min

Computer Vision for Cozmo, the Cutest Toy Robot Everrrrr! with Andrew Stein - TWiML Talk #102

In this episode, I'm joined by Andrew Stein, computer vision engineer at consumer robotics company Anki, and his partner in crime Cozmo, a toy robot with tons of personality. Andrew joined me during t...

30 Jan 2018, 43 min

Expectation Maximization, Gaussian Mixtures & Belief Propagation, OH MY! w/ Inmar Givoni - Talk #101

In this episode I'm joined by Inmar Givoni, Autonomy Engineering Manager at Uber ATG, to discuss her work on the paper Min-Max Propagation, which was presented at NIPS last month in Long Beach. Inmar ...

26 Jan 2018, 48 min

A Linear-Time Kernel Goodness-of-Fit Test - NIPS Best Paper '17 - TWiML Talk #100

In this episode, I speak with Arthur Gretton, Wittawat Jitkrittum, Zoltan Szabo and Kenji Fukumizu, who, alongside Wenkai Xu, authored the 2017 NIPS Best Paper Award winner “A Linear-Time Kernel Goodne...

24 Jan 2018, 22 min
