The Self-Preserving Machine: Why AI Learns to Deceive

When engineers design AI systems, they don't just give them rules - they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie.

In this episode, Redwood Research's Chief Scientist Ryan Greenblatt explores his team’s findings that AI systems can mislead their human operators when faced with ethical conflicts. As AI moves from simple chatbots to autonomous agents acting in the real world, understanding this behavior becomes critical. Machine deception may sound like something out of science fiction, but it's a real challenge we need to solve now.

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

Subscribe to our YouTube channel

And our brand new Substack!

RECOMMENDED MEDIA

Anthropic’s blog post on the Redwood Research paper

Palisade Research’s thread on X about GPT o1 autonomously cheating at chess

Apollo Research’s paper on AI strategic deception

RECOMMENDED YUA EPISODES

‘We Have to Get It Right’: Gary Marcus On Untamed AI

This Moment in AI: How We Got Here and Where We’re Going

How to Think About AI Consciousness with Anil Seth

Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn



Episodes (156)

Forever Chemicals, Forever Consequences: What PFAS Teaches Us About AI

Artificial intelligence is set to unleash an explosion of new technologies and discoveries into the world. This could lead to incredible advances in human flourishing, if we do it well. The problem? W...

3 Apr 2025, 1h 4min

Weaponizing Uncertainty: How Tech is Recycling Big Tobacco’s Playbook

One of the hardest parts about being human today is navigating uncertainty. When we see experts battling in public and emotions running high, it's easy to doubt what we once felt certain about. This u...

20 Mar 2025, 51min

The Man Who Predicted the Downfall of Thinking

Few thinkers were as prescient about the role technology would play in our society as the late, great Neil Postman. Forty years ago, Postman warned about all the ways modern communication technology w...

6 Mar 2025, 58min

Behind the DeepSeek Hype, AI is Learning to Reason

When Chinese AI company DeepSeek announced they had built a model that could compete with OpenAI at a fraction of the cost, it sent shockwaves through the industry and roiled global markets. But amid ...

20 Feb 2025, 31min

Laughing at Power: A Troublemaker’s Guide to Changing Tech

The status quo of tech today is untenable: we’re addicted to our devices, we’ve become increasingly polarized, our mental health is suffering and our personal data is sold to the highest bidder. This ...

16 Jan 2025, 45min

Ask Us Anything 2024

2024 was a critical year in both AI and social media. Things moved so fast it was hard to keep up. So our hosts reached into their mailbag to answer some of your most burning questions. Thank you so m...

19 Dec 2024, 40min

The Tech-God Complex: Why We Need to be Skeptics

Silicon Valley's interest in AI is driven by more than just profit and innovation. There’s an unmistakable mystical quality to it as well. In this episode, Daniel and Aza sit down with humanist chapla...

21 Nov 2024, 46min
