The AI Models Smart Enough to Know They're Cheating — Beth Barnes & David Rein [METR]

The AI Models Smart Enough to Know They're Cheating — Beth Barnes & David Rein [METR]

Beth Barnes and David Rein on the one graph that ate the AI timelines discourse, and why the two people who built it are the most careful about how you read it.**SPONSOR**Prolific - Quality data. From real people. For faster breakthroughs.https://www.prolific.com/?utm_source=mlstInterview: https://youtu.be/cnxZZTl1tkk---Beth Barnes and David Rein from METR on the one graph that ate the AI timelines discourse, and why the people who built it are the most careful about how it gets read.Beth founded METR after leaving OpenAI alignment. David is first author on GPQA and co-author on HCAST and the METR Time Horizons paper. Together they built the measurement Daniel Kokotajlo called the single most important piece of evidence on AI timelines: the log-linear line of "how long a task a frontier model can complete at 50% reliability" vs release date.The conversation opens on reward hacking. Current models can articulate in chat why a behaviour is undesired and then execute it anyway as agents. From there: construct validity, Melanie Mitchell's four-problem taxonomy, and the ARC-AGI 1-to-2 collapse as a worked example of adversarially-selected benchmarks regressing once labs target them. Beth's counter: METR deliberately does not adversarially select. David's: models do not have to do the right thing for the right reasons.Methodology, then specification — David's compiler analogy, Beth on four-month tasks as expensive to evaluate rather than unspecifiable. Then the SWE-bench reality check, the METR finding that half of passing PRs would not be merged, and Beth's horses-versus-bank-tellers analogy for the labour market.The close: monitorability, the coin-spinning boat, two-year recursive self-improvement, and Beth's line that "overhyped now" and "big deal later" are not correlated claims.---TIMESTAMPS:00:00:00 Intro00:02:06 Sponsor break: Prolific human-feedback infrastructure00:02:33 Welcome and the scalable oversight motivation00:06:02 Construct validity, benchmark pathologies and the Chollet worry00:15:45 Time Horizons: human time, HCAST tasks and the 50% logistic00:24:50 Is human difficulty really one variable?00:33:05 Agent harness evolution and the inference-compute dividend00:40:00 Scaffolding bells, token budgets and the credit-assignment problem00:44:15 Look at the damn graph: regularisation bug and reliability nuance00:50:00 Why 50%? Reliability, reward hacking and pizza-party transcripts00:55:20 Extrapolation risk and straight lines on graphs00:59:25 Software engineering as a specification acquisition problem01:07:40 Compilers also made ugly code: vibe-coding quality and Claude on METR Slack01:15:15 Strongest defensible claim, Carlini's compiler swarm and AI 202701:23:45 SWE-bench merge rates, the bank-teller analogy and horses01:31:45 Scheming, alignment faking and the mentalistic vocabulary problem01:40:45 Reward hacking, monitorability and chain-of-thought faithfulness01:45:25 Recursive self-improvement, knowledge vs intelligence and closing

ReScript: https://app.rescript.info/public/share/de3bb40cc02ee39fdf36e2c60366eb4d

(PDF, refs, transcript etc)

Det här avsnittet är hämtat från ett öppet RSS-flöde och publiceras inte av Podme. Det kan innehålla reklam.

Avsnitt(251)

Intelligence is collective, not artificial — Prof. Michael I. Jordan (UC Berkeley / Inria)

Intelligence is collective, not artificial — Prof. Michael I. Jordan (UC Berkeley / Inria)

Michael I. Jordan, described by Science magazine as the most influential computer scientist alive, has never thought of himself as an AI researcher. In this conversation he explains why that distincti...

21 Maj 1h 17min

When AI Discovers The Next Transformer - Robert Lange (Sakana)

When AI Discovers The Next Transformer - Robert Lange (Sakana)

Robert Lange, founding researcher at Sakana AI, joins Tim to discuss *Shinka Evolve* — a framework that combines LLMs with evolutionary algorithms to do open-ended program search. The core claim: syst...

13 Mars 1h 18min

"Vibe Coding is a Slot Machine" - Jeremy Howard

"Vibe Coding is a Slot Machine" - Jeremy Howard

Dive into the realities of AI-assisted coding, the origins of modern fine-tuning, and the cognitive science behind machine learning with fast.ai founder Jeremy Howard. In this episode, we unpack why A...

3 Mars 1h 26min

 Evolution "Doesn't Need" Mutation - Blaise Agüera y Arcas

Evolution "Doesn't Need" Mutation - Blaise Agüera y Arcas

What if life itself is just a really sophisticated computer program that wrote itself into existence?Blaise Agüera y Arcas presenting at ALife 2025 — the most technically detailed public walkthrough o...

16 Feb 55min

VAEs Are Energy-Based Models? [Dr. Jeff Beck]

VAEs Are Energy-Based Models? [Dr. Jeff Beck]

What makes something truly *intelligent?* Is a rock an agent? Could a perfect simulation of your brain actually *be* you? In this fascinating conversation, Dr. Jeff Beck takes us on a journey through ...

25 Jan 46min

Abstraction & Idealization: AI's Plato Problem [Mazviita Chirimuuta]

Abstraction & Idealization: AI's Plato Problem [Mazviita Chirimuuta]

Professor Mazviita Chirimuuta joins us for a fascinating deep dive into the philosophy of neuroscience and what it really means to understand the mind.*What can neuroscience actually tell us about how...

23 Jan 53min

Why Every Brain Metaphor in History Has Been Wrong [SPECIAL EDITION]

Why Every Brain Metaphor in History Has Been Wrong [SPECIAL EDITION]

What if everything we think we know about the brain is just a really good metaphor that we forgot was a metaphor?This episode takes you on a journey through the history of scientific simplification, f...

23 Jan 42min

Populärt inom Teknik

uppgang-och-fall
elbilsveckan
market-makers
bilar-med-sladd
rss-laddstationen-med-elbilen-i-sverige
rss-elektrikerpodden
developers-mer-an-bara-kod
natets-morka-sida
rss-veckans-ai
skogsforum-podcast
rss-technokratin
bosse-bildoktorn-och-hasse-p
under-femton
har-vi-akt-till-mars-an
ai-sweden-podcast
rss-uppgang-och-fall
rss-upplyst-entreprenordirektor
rss-bakom-boken
rss-powerboat-sverige-podcast
rss-hit-med-dina-lunchpengar