Your Undivided Attention30 Jan 2025

The Self-Preserving Machine: Why AI Learns to Deceive

When engineers design AI systems, they don't just give them rules - they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie.

In this episode, Redwood Research's Chief Scientist Ryan Greenblatt explores his team’s findings that AI systems can mislead their human operators when faced with ethical conflicts. As AI moves from simple chatbots to autonomous agents acting in the real world - understanding this behavior becomes critical. Machine deception may sound like something out of science fiction, but it's a real challenge we need to solve now.

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

Subscribe to your Youtube channel

And our brand new Substack!

RECOMMENDED MEDIA

Anthropic’s blog post on the Redwood Research paper

Palisade Research’s thread on X about GPT o1 autonomously cheating at chess

Apollo Research’s paper on AI strategic deception

RECOMMENDED YUA EPISODES

We Have to Get It Right’: Gary Marcus On Untamed AI

This Moment in AI: How We Got Here and Where We’re Going

How to Think About AI Consciousness with Anil Seth

Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Oppdag Premium

Prøv 14 dager gratis

Kjøp Premium

Episoder(156)

Here’s Our Roadmap to a Better AI Future

In order to shift the incentives of AI — the trillions of dollars in investment, the race to geopolitical power and dominance — it’s not enough to simply understand the problem, we need real action. ...

2 Apr 52min

Why the Meta Verdicts Are a Big Deal (And What It Was Like to Testify)

In two landmark cases, juries in California and New Mexico found Meta and Google liable for creating addictive, harmful products and failing to protect children from exploitation and abuse. These verd...

26 Mar 19min

A Conversation with the Team Behind "The AI Doc"

“The AI Doc: Or How I Became An Apocaloptimist” opens in theaters across the U.S. this Friday, March 27. In this episode, we sit down with the team behind this groundbreaking documentary — Oscar-winni...

23 Mar 47min

AI Is Breaking Education. Rebecca Winthrop Has the Blueprint to Fix It.

The promise of AI in education is incredible: picture infinitely patient tutors that can teach every student exactly the way they need to be taught. But the history of education technology tells us th...

5 Mar 46min

The Race to Build God: AI's Existential Gamble — Yoshua Bengio & Tristan Harris at Davos

This week on Your Undivided Attention, Tristan Harris and Daniel Barcay offer a backstage recap of what it was like to be at the Davos World Economic Forum meeting this year as the world’s power broke...

19 Feb 37min

FEED DROP: Possible with Reid Hoffman and Aria Finger

This week on Your Undivided Attention, we’re bringing you Aza Raskin’s conversation with Reid Hoffman and Aria Finger on their podcast “Possible”. Reid and Aria are both tech entrepreneurs: Reid is th...

5 Feb 1h 7min

Attachment Hacking and the Rise of AI Psychosis

Therapy and companionship has become the #1 use case for AI, with millions worldwide sharing their innermost thoughts with AI systems — often things they wouldn't tell loved ones or human therapists. ...

21 Jan 50min

What Would It Take to Actually Trust Each Other? The Game Theory Dilemma

So much of our world today can be summed up in the cold logic of “if I don’t, they will.” This is the foundation of game theory, which holds that cooperation and virtue are irrational; that all that m...

8 Jan 45min

Reklamefrie Premium-podkaster

Hør populære podkaster som Storefri med Mikkel og Herman, Ida med hjertet i hånden, Krimpodden og mye mye mer

Skap din egen podkastboble

I appen skaper du ditt eget bibliotek med favoritter, og vi gir deg også anbefalinger til podkaster du ikke kan gå glipp av.

Prøv 14 dager gratis

Dersom du er ny Podme-bruker får du 14 dager gratis prøveperiode når du oppretter abonnement

Premium

99 kr/ måned

Tilgang til alle våre Premium-podkaster
Alle podkaster fra VG, Aftenposten, BT og SA
Reklamefritt Premium-innhold
Ingen bindingstid. Avslutt når du ønsker

Prøv 14 dager gratis

Premium

129 kr/ måned

Tilgang til alle Premium-podkaster
Alle podkaster fra VG, Aftenposten, BT og SA
Reklamefritt Premium-innhold
Ingen bindingstid. Avslutt når du ønsker
En Ekstra bruker

Prøv 14 dager gratis

Populært innen Samfunn

Historiene og stemmene du vil høre

Ubegrenset tilgang til alle dine favorittpodkaster og lydbøker

Les mer