Quantizing Transformers by Helping Attention Heads Do Nothing with Markus Nagel - #663

Quantizing Transformers by Helping Attention Heads Do Nothing with Markus Nagel - #663

Today we’re joined by Markus Nagel, research scientist at Qualcomm AI Research, who helps us kick off our coverage of NeurIPS 2023. In our conversation with Markus, we cover his accepted papers at the conference, along with other work presented by Qualcomm AI Research scientists. Markus’ first paper, Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing, focuses on tackling activation quantization issues introduced by the attention mechanism and how to solve them. We also discuss Pruning vs Quantization: Which is Better?, which focuses on comparing the effectiveness of these two methods in achieving model weight compression. Additional papers discussed focus on topics like using scalarization in multitask and multidomain learning to improve training and inference, using diffusion models for a sequence of state models and actions, applying geometric algebra with equivariance to transformers, and applying a deductive verification of chain of thought reasoning performed by LLMs. The complete show notes for this episode can be found at twimlai.com/go/663.

Avsnitt(779)

Are Large Language Models a Path to AGI? with Ben Goertzel - #625

Are Large Language Models a Path to AGI? with Ben Goertzel - #625

Today we’re joined by Ben Goertzel, CEO of SingularityNET. In our conversation with Ben, we explore all things AGI, including the potential scenarios that could arise with the advent of AGI and his pr...

17 Apr 202359min

Open Source Generative AI at Hugging Face with Jeff Boudier - #624

Open Source Generative AI at Hugging Face with Jeff Boudier - #624

Today we’re joined by Jeff Boudier, head of product at Hugging Face 🤗. In our conversation with Jeff, we explore the current landscape of open-source machine learning tools and models, the recent shi...

11 Apr 202333min

Generative AI at the Edge with Vinesh Sukumar - #623

Generative AI at the Edge with Vinesh Sukumar - #623

Today we’re joined by Vinesh Sukumar, a senior director and head of AI/ML product management at Qualcomm Technologies. In our conversation with Vinesh, we explore how mobile and automotive devices hav...

3 Apr 202339min

Runway Gen-2: Generative AI for Video Creation with Anastasis Germanidis - #622

Runway Gen-2: Generative AI for Video Creation with Anastasis Germanidis - #622

Today we’re joined by Anastasis Germanidis, Co-Founder and CTO of RunwayML. Amongst all the product and model releases over the past few months, Runway threw its hat into the ring with Gen-1, a model ...

27 Mars 202349min

Watermarking Large Language Models to Fight Plagiarism with Tom Goldstein - 621

Watermarking Large Language Models to Fight Plagiarism with Tom Goldstein - 621

Today we’re joined by Tom Goldstein, an associate professor at the University of Maryland. Tom’s research sits at the intersection of ML and optimization and has previously been featured in the New Yo...

20 Mars 202351min

Does ChatGPT “Think”? A Cognitive Neuroscience Perspective with Anna Ivanova - #620

Does ChatGPT “Think”? A Cognitive Neuroscience Perspective with Anna Ivanova - #620

Today we’re joined by Anna Ivanova, a postdoctoral researcher at MIT Quest for Intelligence. In our conversation with Anna, we discuss her recent paper Dissociating language and thought in large langu...

13 Mars 202345min

Robotic Dexterity and Collaboration with Monroe Kennedy III - #619

Robotic Dexterity and Collaboration with Monroe Kennedy III - #619

Today we’re joined by Monroe Kennedy III, an assistant professor at Stanford, director of the Assistive Robotics and Manipulation Lab, and a national director of Black in Robotics. In our conversation...

6 Mars 202352min

Privacy and Security for Stable Diffusion and LLMs with Nicholas Carlini - #618

Privacy and Security for Stable Diffusion and LLMs with Nicholas Carlini - #618

Today we’re joined by Nicholas Carlini, a research scientist at Google Brain. Nicholas works at the intersection of machine learning and computer security, and his recent paper “Extracting Training Da...

27 Feb 202343min

Populärt inom Politik & nyheter

motiv
p3-krim
spar
svenska-fall
flashback-forever
rss-krimstad
rss-viva-fotboll
rss-sanning-konsekvens
aftonbladet-daily
aftonbladet-krim
rss-vad-fan-hande
rss-krimreportrarna
olyckan-inifran
rss-frandfors-horna
fordomspodden
dagens-eko
rss-flodet
svd-ledarredaktionen
politiken
rss-aftonbladet-krim