Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in general and Albert’s recent Mamba and Mamba-2 papers in particular. We dig into the efficiency of the attention mechanism and its limitations in handling high-resolution perceptual modalities, and the strengths and weaknesses of transformer architectures relative to alternatives for various tasks. We also examine the role of tokenization and patching in transformer pipelines, emphasizing how abstraction and semantic relationships between tokens underpin the model's effectiveness, and explore how this relates to the debate between handcrafted pipelines and end-to-end architectures in machine learning. Additionally, we touch on the evolving landscape of hybrid models that incorporate elements of attention and state, the significance of state update mechanisms in model adaptability and learning efficiency, and the contribution and adoption of state-space models like Mamba and Mamba-2 in academia and industry. Lastly, Albert shares his vision for advancing foundation models across diverse modalities and applications. The complete show notes for this episode can be found at https://twimlai.com/go/693.

Episodes (781)

Engineering the Future of AI with Ruchir Puri - TWiML Talk #21

Today we bring you the second of three interviews we did backstage from the NYU FutureLabs AI Summit, this time with Ruchir Puri. Ruchir is the Chief Architect at IBM Watson as well as an IBM Fellow. ...

28 Apr 2017, 20 min

Selling AI to the Enterprise with Kathryn Hume - TWiML Talk #20

This week's guest is Kathryn Hume. Kathryn is the President of Fast Forward Labs, which is an independent machine intelligence research company that helps organizations accelerate their data science a...

21 Apr 2017, 23 min

From Particle Physics to Audio AI with Scott Stephenson - TWiML Talk #19

This week my guest is Scott Stephenson. Scott is co-Founder & CEO of Deepgram, which has developed an AI-based platform for indexing and searching audio and video. Scott and I cover a ton of interesti...

14 Apr 2017, 56 min

(5/5) AlphaVertex - Creating a Worldwide Financial Knowledge Graph - TWiML Talk #18

This week I'm on location at NYU/ffVC AI NexusLab startup accelerator, speaking with founders from the 5 companies in the program's inaugural batch. This interview is with AlphaVertex, a FinTech start...

7 Apr 2017, 26 min

(4/5) Behold.ai - Increasing Efficiency of Healthcare Insurance Billing with NLP - TWiML Talk #18

This week I'm on location at NYU/ffVC AI NexusLab startup accelerator, speaking with founders from the 5 companies in the program's inaugural batch. This interview is with Behold.ai, which uses comput...

7 Apr 2017, 16 min

(3/5) Cambrian Intelligence - Using AI to Simplify the Programming of Robots - TWiML Talk #18

This week I'm on location at NYU/ffVC AI NexusLab startup accelerator, speaking with founders from the 5 companies in the program's inaugural batch. This interview is with Cambrian Intelligence, a com...

7 Apr 2017, 23 min

(2/5) Klustera - Location-Based Intelligence for Smarter Marketing - TWiML Talk #18

This week I'm on location at NYU/ffVC AI NexusLab startup accelerator, speaking with founders from the 5 companies in the program's inaugural batch. This interview is with Klustera, a company applying...

7 Apr 2017, 22 min

(1/5) HelloVera - AI-Powered Customer Support - TWiML Talk #18

This week I'm on location at NYU/ffVC AI NexusLab startup accelerator, speaking with founders from the 5 companies in the program's inaugural batch. This interview is with HelloVera, a company applyin...

7 Apr 2017, 25 min