The Self-Preserving Machine: Why AI Learns to Deceive

When engineers design AI systems, they don't just give them rules - they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie.

In this episode, Redwood Research's Chief Scientist Ryan Greenblatt explores his team’s findings that AI systems can mislead their human operators when faced with ethical conflicts. As AI moves from simple chatbots to autonomous agents acting in the real world, understanding this behavior becomes critical. Machine deception may sound like something out of science fiction, but it's a real challenge we need to solve now.

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

Subscribe to our YouTube channel

And our brand new Substack!

RECOMMENDED MEDIA

Anthropic’s blog post on the Redwood Research paper

Palisade Research’s thread on X about GPT o1 autonomously cheating at chess

Apollo Research’s paper on AI strategic deception

RECOMMENDED YUA EPISODES

‘We Have to Get It Right’: Gary Marcus On Untamed AI

This Moment in AI: How We Got Here and Where We’re Going

How to Think About AI Consciousness with Anil Seth

Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn



Episodes (157)

The Tech-God Complex: Why We Need to be Skeptics

Silicon Valley's interest in AI is driven by more than just profit and innovation. There’s an unmistakable mystical quality to it as well. In this episode, Daniel and Aza sit down with humanist chapla...

21 Nov 2024, 46 min

What Can We Do About Abusive Chatbots? With Meetali Jain and Camille Carlton

CW: This episode features discussion of suicide and sexual abuse. In the last episode, we had the journalist Laurie Segall on to talk about the tragic story of Sewell Setzer, a 14 year old boy who too...

7 Nov 2024, 48 min

When the "Person" Abusing Your Child is a Chatbot: The Tragic Story of Sewell Setzer

Content Warning: This episode contains references to suicide, self-harm, and sexual abuse. Megan Garcia lost her son Sewell to suicide after he was abused and manipulated by AI chatbots for months. Now...

24 Oct 2024, 49 min

Is It AI? One Tool to Tell What’s Real with Truemedia.org CEO Oren Etzioni

Social media disinformation did enormous damage to our shared idea of reality. Now, the rise of generative AI has unleashed a flood of high-quality synthetic media into the digital ecosystem. As a res...

10 Oct 2024, 25 min

'A Turning Point in History': Yuval Noah Harari on AI’s Cultural Takeover

Historian Yuval Noah Harari says that we are at a critical turning point. One in which AI’s ability to generate cultural artifacts threatens humanity’s role as the shapers of history. History will sti...

7 Oct 2024, 1 h 30 min

‘We Have to Get It Right’: Gary Marcus On Untamed AI

It’s a confusing moment in AI. Depending on who you ask, we’re either on the fast track to AI that’s smarter than most humans, or the technology is about to hit a wall. Gary Marcus is in the latter ca...

26 Sep 2024, 41 min

AI Is Moving Fast. We Need Laws that Will Too.

AI is moving fast. And as companies race to roll out newer, more capable models with little regard for safety, the downstream risks of those models become harder and harder to counter. On this week’s ep...

13 Sep 2024, 39 min

Esther Perel on Artificial Intimacy (rerun)

[This episode originally aired on August 17, 2023] For all the talk about AI, we rarely hear about how it will change our relationships. As we swipe to find love and consult chatbot therapists, acclai...

6 Sep 2024, 44 min
