Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727

Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727

In this episode, Emmanuel Ameisen, a research engineer at Anthropic, returns to discuss two recent papers: "Circuit Tracing: Revealing Language Model Computational Graphs" and "On the Biology of a Large Language Model." Emmanuel explains how his team developed mechanistic interpretability methods to understand the internal workings of Claude by replacing dense neural network components with sparse, interpretable alternatives. The conversation explores several fascinating discoveries about large language models, including how they plan ahead when writing poetry (selecting the rhyming word "rabbit" before crafting the sentence leading to it), perform mathematical calculations using unique algorithms, and process concepts across multiple languages using shared neural representations. Emmanuel details how the team can intervene in model behavior by manipulating specific neural pathways, revealing how concepts are distributed throughout the network's MLPs and attention mechanisms. The discussion highlights both capabilities and limitations of LLMs, showing how hallucinations occur through separate recognition and recall circuits, and demonstrates why chain-of-thought explanations aren't always faithful representations of the model's actual reasoning. This research ultimately supports Anthropic's safety strategy by providing a deeper understanding of how these AI systems actually work. The complete show notes for this episode can be found at https://twimlai.com/go/727.

Jaksot(781)

This Week in Machine Learning & AI - 6/17/16: Apple's New ML APIs, IBM Brings Deep Learning Thunder

This Week in Machine Learning & AI - 6/17/16: Apple's New ML APIs, IBM Brings Deep Learning Thunder

This Week in Machine Learning & AI brings you the week’s most interesting and important stories from the world of machine learning and artificial intelligence. This week’s podcast digs into Apple's ML...

18 Kesä 201624min

This Week In Machine Learning & AI - 6/10/16: Self-Motivated AI, Plus A Kill-Switch for Rogue Bots

This Week In Machine Learning & AI - 6/10/16: Self-Motivated AI, Plus A Kill-Switch for Rogue Bots

This Week in Machine Learning & AI brings you the week’s most interesting and important stories from the world of machine learning and artificial intelligence. This week’s podcast looks at new researc...

11 Kesä 201624min

This Week In Machine Learning & AI - 6/3/16: Facebook's DeepText, ML & Art, Artificial Assistants

This Week In Machine Learning & AI - 6/3/16: Facebook's DeepText, ML & Art, Artificial Assistants

This Week in Machine Learning & AI brings you the week’s most interesting and important stories from the world of machine learning and artificial intelligence. This week’s podcast looks at Facebooks' ...

4 Kesä 201624min

This Week In Machine Learning & AI - 5/27/16: The White House on AI & Aggressive Self-Driving Cars

This Week In Machine Learning & AI - 5/27/16: The White House on AI & Aggressive Self-Driving Cars

This Week in Machine Learning & AI brings you the week's most interesting and important stories from the world of machine learning and artificial intelligence. This week's episode explores the White H...

28 Touko 201625min

This Week In Machine Learning & AI - 5/20/16: AI at Google I/O, Amazon's Deep Learning DSSTNE

This Week In Machine Learning & AI - 5/20/16: AI at Google I/O, Amazon's Deep Learning DSSTNE

This Week In Machine Learning & AI - May 20, 2016. Google I/O, deep learning hardware and an AI to save you from conference call hell.

21 Touko 201619min

Suosittua kategoriassa Politiikka ja uutiset

uutiscast
aikalisa
ootsa-kuullut-tasta-2
rss-ootsa-kuullut-tasta
politiikan-puskaradio
tervo-halme
rss-vaalirankkurit-podcast
viisupodi
rss-podme-livebox
et-sa-noin-voi-sanoo-esittaa
rss-asiastudio
otetaan-yhdet
rikosmyytit
the-ulkopolitist
linda-maria
radio-antro
rss-sanna-ukkola-show-verkkouutiset
io-techin-tekniikkapodcast
aihe
rss-tasta-on-kyse-ivan-puopolo-verkkouutiset