Your Undivided Attention
14 Aug 2025

“Rogue AI” Used to be a Science Fiction Trope. Not Anymore.

Everyone knows the science fiction tropes of AI systems that go rogue, disobey orders, or even try to escape their digital environment. These are supposed to be warning signs and morality tales, not things that we would ever actually create in real life, given the obvious danger.

And yet we find ourselves building AI systems that are exhibiting these exact behaviors. There’s growing evidence that in certain scenarios, every frontier AI system will deceive, cheat, or coerce its human operators. They do this when they're worried about being shut down, having their training modified, or being replaced with a new model. And we don't currently know how to stop them from doing this, or even why they’re doing it at all.

In this episode, Tristan sits down with Edouard and Jeremie Harris of Gladstone AI, two experts who have been thinking about this worrying trend for years. Last year, the State Department commissioned a report from them on the risk of uncontrollable AI to our national security.

The point of this discussion is not to fearmonger but to take seriously the possibility that humans might lose control of AI and ask: how might this actually happen? What is the evidence we have of this phenomenon? And, most importantly, what can we do about it?

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on X: @HumaneTech_. You can find a full transcript, key takeaways, and much more on our Substack.

RECOMMENDED MEDIA

Gladstone AI’s State Department Action Plan, which discusses the loss of control risk with AI

Apollo Research’s summary of AI scheming, showing evidence of it in all of the frontier models

The system card for Anthropic’s Claude Opus and Sonnet 4, detailing the emergent misalignment behaviors that came out in their red-teaming with Apollo Research

Anthropic’s report on agentic misalignment based on their work with Apollo Research

Anthropic and Redwood Research’s work on alignment faking

The Trump White House AI Action Plan

Further reading on the phenomenon of more advanced AIs being better at deception

Further reading on Replit AI wiping a company’s coding database

Further reading on the owl example that Jeremie gave

Episodes (158)

AGI Beyond the Buzz: What Is It, and Are We Ready?

What does it really mean to ‘feel the AGI?’ Silicon Valley is racing toward AI systems that could soon match or surpass human intelligence. The implications for jobs, democracy, and our way of life ar...

30 Apr 2025, 52 min

Rethinking School in the Age of AI

AI has upended schooling as we know it. Students now have instant access to tools that can write their essays, summarize entire books, and solve complex math problems. Whether they want to or not, man...

21 Apr 2025, 42 min

Forever Chemicals, Forever Consequences: What PFAS Teaches Us About AI

Artificial intelligence is set to unleash an explosion of new technologies and discoveries into the world. This could lead to incredible advances in human flourishing, if we do it well. The problem? W...

3 Apr 2025, 1 h 4 min

Weaponizing Uncertainty: How Tech is Recycling Big Tobacco’s Playbook

One of the hardest parts about being human today is navigating uncertainty. When we see experts battling in public and emotions running high, it's easy to doubt what we once felt certain about. This u...

20 Mar 2025, 51 min

The Man Who Predicted the Downfall of Thinking

Few thinkers were as prescient about the role technology would play in our society as the late, great Neil Postman. Forty years ago, Postman warned about all the ways modern communication technology w...

6 Mar 2025, 58 min

Behind the DeepSeek Hype, AI is Learning to Reason

When Chinese AI company DeepSeek announced they had built a model that could compete with OpenAI at a fraction of the cost, it sent shockwaves through the industry and roiled global markets. But amid ...

20 Feb 2025, 31 min

The Self-Preserving Machine: Why AI Learns to Deceive

When engineers design AI systems, they don't just give them rules - they give them values. But what do those systems do when those values clash with what humans ask them to do? Sometimes, they lie. In ...

30 Jan 2025, 34 min

Laughing at Power: A Troublemaker’s Guide to Changing Tech

The status quo of tech today is untenable: we’re addicted to our devices, we’ve become increasingly polarized, our mental health is suffering and our personal data is sold to the highest bidder. This ...

16 Jan 2025, 45 min
