The Self-Preserving Machine: Why AI Learns to Deceive

When engineers design AI systems, they don't just give them rules; they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie.

In this episode, Redwood Research's Chief Scientist Ryan Greenblatt explores his team’s findings that AI systems can mislead their human operators when faced with ethical conflicts. As AI moves from simple chatbots to autonomous agents acting in the real world, understanding this behavior becomes critical. Machine deception may sound like something out of science fiction, but it's a real challenge we need to solve now.

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

Subscribe to our YouTube channel

And our brand new Substack!

RECOMMENDED MEDIA

Anthropic’s blog post on the Redwood Research paper

Palisade Research’s thread on X about GPT o1 autonomously cheating at chess

Apollo Research’s paper on AI strategic deception

RECOMMENDED YUA EPISODES

‘We Have to Get It Right’: Gary Marcus On Untamed AI

This Moment in AI: How We Got Here and Where We’re Going

How to Think About AI Consciousness with Anil Seth

Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn



Episodes (156)

America and China Are Racing to Different AI Futures

Is the US really in an AI race with China—or are we racing toward completely different finish lines? In this episode, Tristan Harris sits down with China experts Selina Xu and Matt Sheehan to separate ...

18 Dec 2025, 57 min

AI and the Future of Work: What You Need to Know

No matter where you sit within the economy, whether you're a CEO or an entry level worker, everyone's feeling uneasy about AI and the future of work. Uncertainty about career paths, job security, and ...

4 Dec 2025, 45 min

Feed Drop: "Into the Machine" with Tobias Rose-Stockwell

This week, we’re bringing you Tristan’s conversation with Tobias Rose-Stockwell on his podcast “Into the Machine.”  Tobias is a designer, writer, and technologist and the author of the book “The Outra...

13 Nov 2025, 1 h 4 min

What if we had fixed social media?

We really enjoyed hearing all of your questions for our annual Ask Us Anything episode. There was one question that kept coming up: what might a different world look like? The broken incentives behind...

6 Nov 2025, 16 min

Ask Us Anything 2025

It's been another big year in AI. The AI race has accelerated to breakneck speed, with frontier labs pouring hundreds of billions into increasingly powerful models—each one smarter, faster, and more u...

23 Oct 2025, 40 min

The Crisis That United Humanity—and Why It Matters for AI

In 1985, scientists in Antarctica discovered a hole in the ozone layer that posed a catastrophic threat to life on earth if we didn’t do something about it. Then, something amazing happened: humanity ...

11 Sep 2025, 51 min

How OpenAI's ChatGPT Guided a Teen to His Death

Content Warning: This episode contains references to suicide and self-harm. Like millions of kids, 16-year-old Adam Raine started using ChatGPT for help with his homework. Over the next few months, th...

26 Aug 2025, 45 min

“Rogue AI” Used to be a Science Fiction Trope. Not Anymore.

Everyone knows the science fiction tropes of AI systems that go rogue, disobey orders, or even try to escape their digital environment. These are supposed to be warning signs and morality tales, not t...

14 Aug 2025, 42 min
