The Self-Preserving Machine: Why AI Learns to Deceive

When engineers design AI systems, they don't just give them rules - they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie.

In this episode, Redwood Research's Chief Scientist Ryan Greenblatt explores his team’s findings that AI systems can mislead their human operators when faced with ethical conflicts. As AI moves from simple chatbots to autonomous agents acting in the real world, understanding this behavior becomes critical. Machine deception may sound like something out of science fiction, but it's a real challenge we need to solve now.

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

Subscribe to our YouTube channel

And our brand new Substack!

RECOMMENDED MEDIA

Anthropic’s blog post on the Redwood Research paper

Palisade Research’s thread on X about OpenAI’s o1 autonomously cheating at chess

Apollo Research’s paper on AI strategic deception

RECOMMENDED YUA EPISODES

‘We Have to Get It Right’: Gary Marcus On Untamed AI

This Moment in AI: How We Got Here and Where We’re Going

How to Think About AI Consciousness with Anil Seth

Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn



Episodes (158)

Spotlight: AI Myths and Misconceptions

A few episodes back, we presented Tristan Harris and Aza Raskin’s talk The AI Dilemma. People inside the companies that are building generative artificial intelligence came to us with their concerns a...

11 May 2023, 26 min

Talking With Animals… Using AI

Despite our serious concerns about the pace of deployment of generative artificial intelligence, we are not anti-AI. There are uses that can help us better understand ourselves and the world around us...

4 May 2023, 24 min

Can We Govern AI?

When it comes to AI, what kind of regulations might we need to address this rapidly developing new class of technologies? What makes regulating AI and runaway tech in general different from regulating...

21 April 2023, 39 min

Spotlight: The Three Rules of Humane Tech

In our previous episode, we shared a presentation Tristan and Aza recently delivered to a group of influential technologists about the race happening in AI. In that talk, they introduced the Three Rul...

6 April 2023, 22 min

The AI Dilemma

You may have heard about the arrival of GPT-4, OpenAI’s latest large language model (LLM) release. GPT-4 surpasses its predecessor in terms of reliability, creativity, and ability to process intricate...

24 March 2023, 42 min

TikTok’s Transparency Problem

A few months ago on Your Undivided Attention, we released a Spotlight episode on TikTok's national security risks. Since then, we've learned more about the dangers of the China-owned company: We've se...

2 March 2023, 37 min

Synthetic Humanity: AI & What’s At Stake

It may seem like the rise of artificial intelligence, and increasingly powerful large language models you may have heard of, is moving really fast… and it IS. But what’s coming next is when we enter s...

16 February 2023, 46 min

The Race to Cooperation

It’s easy to tell ourselves we’re living in the world we want – one where Darwinian evolution drives competing technology platforms and capitalism pushes nations to maximize GDP regardless of external...

2 February 2023, 34 min
