Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We dig into the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines versus end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models which incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Jaksot(778)

Peering into the Home w/ Aerial.ai's Wifi Motion Analytics - TWiML Talk #107

Peering into the Home w/ Aerial.ai's Wifi Motion Analytics - TWiML Talk #107

In this episode I’m joined by Michel Allegue and Negar Ghourchian of Aerial.ai. Aerial is doing some really interesting things in the home automation space, by using wifi signal statistics to identify...

2 Helmi 201840min

Physiology-Based Models for Fitness and Training w/ Firstbeat with Ilkka Korhonen - TWiML Talk #106

Physiology-Based Models for Fitness and Training w/ Firstbeat with Ilkka Korhonen - TWiML Talk #106

In this episode i'm joined by Ilkka Korhonen, Vice President of Technology at Firstbeat, a company whose algorithms are embedded in fitness watches from companies like Garmin and Suunto and which use ...

2 Helmi 201835min

Machine Learning for Signal Processing Applications w/ Stuart Feffer & Brady Tsai - TWiML Talk #105

Machine Learning for Signal Processing Applications w/ Stuart Feffer & Brady Tsai - TWiML Talk #105

In this episode, I'm joined by Stuart Feffer, co-founder and CEO of Reality AI, which provides tools and services for engineers working with sensors and signals, and Brady Tsai, Business Development M...

1 Helmi 201836min

Personalizing the Ferrari Challenge Experience w/ Intel AI - TWiML Talk #104

Personalizing the Ferrari Challenge Experience w/ Intel AI - TWiML Talk #104

In this episode, I'm joined by Andy Keller and Emile Chin-Dickey to discuss Intel's partnership with the Ferrari Challenge North American Series. Andy is a Deep Learning Data Scientist at Intel and Em...

31 Tammi 201837min

Deep Learning for 3D Sensors and Cameras in Lighthouse with Alex Teichman - TWiML Talk #103

Deep Learning for 3D Sensors and Cameras in Lighthouse with Alex Teichman - TWiML Talk #103

In this episode, I sit down with Alex Teichman, CEO and Co-Founder of Lighthouse, a company taking a new approach to the in-home smart camera. Alex and I dig into what exactly the Lighthouse product i...

30 Tammi 201842min

Computer Vision for Cozmo, the Cutest Toy Robot Everrrrr! with Andrew Stein - TWiML Talk #102

Computer Vision for Cozmo, the Cutest Toy Robot Everrrrr! with Andrew Stein - TWiML Talk #102

In this episode, I'm joined by Andrew Stein, computer vision engineer at consumer robotics company Anki, and his partner in crime Cozmo, a toy robot with tons of personality. Andrew joined me during t...

30 Tammi 201843min

Expectation Maximization, Gaussian Mixtures & Belief Propagation, OH MY! w/ Inmar Givoni - Talk #101

Expectation Maximization, Gaussian Mixtures & Belief Propagation, OH MY! w/ Inmar Givoni - Talk #101

In this episode i'm joined by Inmar Givoni, Autonomy Engineering Manager at Uber ATG, to discuss her work on the paper Min-Max Propagation, which was presented at NIPS last month in Long Beach. Inmar ...

26 Tammi 201848min

A Linear-Time Kernel Goodness-of-Fit Test - NIPS Best Paper '17 - TWiML Talk #100

A Linear-Time Kernel Goodness-of-Fit Test - NIPS Best Paper '17 - TWiML Talk #100

In this episode, I speak with Arthur Gretton, Wittawat Jitkrittum, Zoltan Szabo and Kenji Fukumizu, who, alongside Wenkai Xu authored the 2017 NIPS Best Paper Award winner “A Linear-Time Kernel Goodne...

24 Tammi 201822min

Suosittua kategoriassa Politiikka ja uutiset

aikalisa
tervo-halme
rss-ootsa-kuullut-tasta
ootsa-kuullut-tasta-2
politiikan-puskaradio
viisupodi
rss-vaalirankkurit-podcast
otetaan-yhdet
et-sa-noin-voi-sanoo-esittaa
rss-podme-livebox
io-techin-tekniikkapodcast
linda-maria
rikosmyytit
rss-tasta-on-kyse-ivan-puopolo-verkkouutiset
rss-asiastudio
the-ulkopolitist
rss-uusi-juttu
rss-fi-lainsaadanto-paremmaksi
rss-hyvaa-huomenta-bryssel
rss-50100-podcast