The Self-Preserving Machine: Why AI Learns to Deceive

When engineers design AI systems, they don't just give them rules - they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie.

In this episode, Redwood Research's Chief Scientist Ryan Greenblatt explores his team’s findings that AI systems can mislead their human operators when faced with ethical conflicts. As AI moves from simple chatbots to autonomous agents acting in the real world, understanding this behavior becomes critical. Machine deception may sound like something out of science fiction, but it's a real challenge we need to solve now.

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

Subscribe to our YouTube channel

And our brand new Substack!

RECOMMENDED MEDIA

Anthropic’s blog post on the Redwood Research paper

Palisade Research’s thread on X about GPT o1 autonomously cheating at chess

Apollo Research’s paper on AI strategic deception

RECOMMENDED YUA EPISODES

‘We Have to Get It Right’: Gary Marcus On Untamed AI

This Moment in AI: How We Got Here and Where We’re Going

How to Think About AI Consciousness with Anil Seth

Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn


Episodes (156)

Chips Are the Future of AI. They’re Also Incredibly Vulnerable. With Chris Miller

Beneath the race to train and release more powerful AI models lies another race: a race by companies and nation-states to secure the hardware to make sure they win AI supremacy. Correction: The latest...

29 March 2024, 45 min

Future-proofing Democracy In the Age of AI with Audrey Tang

What does a functioning democracy look like in the age of artificial intelligence? Could AI even be used to help a democracy flourish? Just in time for election season, Taiwan’s Minister of Digital Af...

29 February 2024, 34 min

U.S. Senators Grilled Social Media CEOs. Will Anything Change?

Was it political progress, or just political theater? The recent Senate hearing with social media CEOs led to astonishing moments — including Mark Zuckerberg’s public apology to families who lost chil...

13 February 2024, 25 min

Taylor Swift is Not Alone: The Deepfake Nightmare Sweeping the Internet

Over the past year, a tsunami of apps that digitally strip the clothes off real people has hit the market. Now anyone can create fake non-consensual sexual images in just a few clicks. With cases prol...

1 February 2024, 42 min

Can Myth Teach Us Anything About the Race to Build Artificial General Intelligence? With Josh Schrei

We usually talk about tech in terms of economics or policy, but the casual language tech leaders often use to describe AI — summoning an inanimate force with the powers of code — sounds more... magica...

18 January 2024, 35 min

How Will AI Affect the 2024 Elections? with Renee DiResta and Carl Miller

2024 will be the biggest election year in world history. Forty countries will hold national elections, with over two billion voters heading to the polls. In this episode of Your Undivided Attention, t...

21 December 2023, 47 min

2023 Ask Us Anything

You asked, we answered. This has been a big year in the world of tech, with the rapid proliferation of artificial intelligence, acceleration of neurotechnology, and continued ethical missteps of socia...

30 November 2023, 35 min

The Promise and Peril of Open Source AI with Elizabeth Seger and Jeffrey Ladish

As AI development races forward, a fierce debate has emerged over open source AI models. So what does it mean to open-source AI? Are we opening Pandora’s box of catastrophic risks? Or is open-sourcing...

21 November 2023, 38 min
