The Self-Preserving Machine: Why AI Learns to Deceive

When engineers design AI systems, they don't just give them rules; they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie.

In this episode, Redwood Research's Chief Scientist Ryan Greenblatt explores his team’s findings that AI systems can mislead their human operators when faced with ethical conflicts. As AI moves from simple chatbots to autonomous agents acting in the real world, understanding this behavior becomes critical. Machine deception may sound like something out of science fiction, but it's a real challenge we need to solve now.

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

Subscribe to our YouTube channel

And our brand new Substack!

RECOMMENDED MEDIA

Anthropic’s blog post on the Redwood Research paper

Palisade Research’s thread on X about GPT o1 autonomously cheating at chess

Apollo Research’s paper on AI strategic deception

RECOMMENDED YUA EPISODES

‘We Have to Get It Right’: Gary Marcus On Untamed AI

This Moment in AI: How We Got Here and Where We’re Going

How to Think About AI Consciousness with Anil Seth

Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn



Episodes (156)

Social Media Victims Lawyer Up with Laura Marquez-Garrett

Social media was humanity’s ‘first contact’ moment with AI. If we’re going to create laws that are strong enough to prevent AI from destroying our societies, we could benefit from taking a look at the...

21 July 2023, 34 min

Big Food, Big Tech and Big AI with Michael Moss

In the next two episodes of Your Undivided Attention, we take a close look at two respective industries: big food and social media, which represent dangerous “races to the bottom” and have big paralle...

6 July 2023, 34 min

What Can Technologists Learn from Sesame Street? With Dr. Rosemarie Truglio

What happens when creators consider what lifelong human development looks like in terms of the tools we make? And what philosophies from Sesame Street can inform how to steward the power of AI and soc...

22 June 2023, 29 min

Spotlight: How Zombie Values Infect Society

You’re likely familiar with the modern zombie trope: a zombie bites someone you care about and they’re transformed into a creature who wants your brain. Zombies are the perfect metaphor to explain som...

8 June 2023, 22 min

Feed Drop: AI Doomsday with Kara Swisher

There’s really no one better than veteran tech journalist Kara Swisher at challenging people to articulate their thinking. Tristan Harris recently sat down with her for a wide-ranging interview on AI...

2 June 2023, 55 min

The Tech We Need for 21st Century Democracy with Divya Siddarth

Democracy in action has looked the same for generations. Constituents might go to a library or school every one or two years and cast their vote for people who don't actually represent everything that...

25 May 2023, 38 min

Spotlight: AI Myths and Misconceptions

A few episodes back, we presented Tristan Harris and Aza Raskin’s talk The AI Dilemma. People inside the companies that are building generative artificial intelligence came to us with their concerns a...

11 May 2023, 26 min

Talking With Animals… Using AI

Despite our serious concerns about the pace of deployment of generative artificial intelligence, we are not anti-AI. There are uses that can help us better understand ourselves and the world around us...

4 May 2023, 24 min
