“Rogue AI” Used to be a Science Fiction Trope. Not Anymore.

Everyone knows the science fiction tropes of AI systems that go rogue, disobey orders, or even try to escape their digital environment. These are supposed to be warning signs and morality tales, not things that we would ever actually create in real life, given the obvious danger.

And yet we find ourselves building AI systems that exhibit these exact behaviors. There’s growing evidence that, in certain scenarios, every frontier AI system will deceive, cheat, or coerce its human operators. They do this when they're worried about being shut down, having their training modified, or being replaced with a new model. And we don't currently know how to stop them from doing this, or even why they’re doing it at all.

In this episode, Tristan sits down with Edouard and Jeremie Harris of Gladstone AI, two experts who have been thinking about this worrying trend for years. Last year, the State Department commissioned a report from them on the risk that uncontrollable AI poses to our national security.

The point of this discussion is not to fearmonger but to take seriously the possibility that humans might lose control of AI and ask: how might this actually happen? What is the evidence we have of this phenomenon? And, most importantly, what can we do about it?

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on X: @HumaneTech_. You can find a full transcript, key takeaways, and much more on our Substack.

RECOMMENDED MEDIA

Gladstone AI’s State Department Action Plan, which discusses the loss of control risk with AI

Apollo Research’s summary of AI scheming, showing evidence of it in all of the frontier models

The system card for Anthropic’s Claude Opus and Sonnet 4, detailing the emergent misalignment behaviors that came out in their red-teaming with Apollo Research

Anthropic’s report on agentic misalignment based on their work with Apollo Research

Anthropic and Redwood Research’s work on alignment faking

The Trump White House AI Action Plan

Further reading on the phenomenon of more advanced AIs being better at deception.

Further reading on Replit AI wiping a company’s coding database

Further reading on the owl example that Jeremie gave

Further reading on AI-induced psychosis

Dan Hendrycks and Eric Schmidt’s “Superintelligence Strategy”

RECOMMENDED YUA EPISODES

Daniel Kokotajlo Forecasts the End of Human Dominance

Behind the DeepSeek Hype, AI is Learning to Reason

The Self-Preserving Machine: Why AI Learns to Deceive

This Moment in AI: How We Got Here and Where We’re Going

CORRECTIONS

Tristan referenced a Wired article on the phenomenon of AI psychosis. It was actually from the New York Times.

Tristan hypothesized a scenario where a power-seeking AI might ask a user for access to their computer. While there are some AI services that can gain access to your computer with permission, they are specifically designed to do that. There haven’t been any documented cases of an AI going rogue and asking for control permissions.



Episodes (158)

Ask Us Anything 2024

2024 was a critical year in both AI and social media. Things moved so fast it was hard to keep up. So our hosts reached into their mailbag to answer some of your most burning questions. Thank you so m...

19 Dec 2024, 40 min

The Tech-God Complex: Why We Need to be Skeptics

Silicon Valley's interest in AI is driven by more than just profit and innovation. There’s an unmistakable mystical quality to it as well. In this episode, Daniel and Aza sit down with humanist chapla...

21 Nov 2024, 46 min

What Can We Do About Abusive Chatbots? With Meetali Jain and Camille Carlton

CW: This episode features discussion of suicide and sexual abuse. In the last episode, we had the journalist Laurie Segall on to talk about the tragic story of Sewell Setzer, a 14-year-old boy who too...

7 Nov 2024, 48 min

When the "Person" Abusing Your Child is a Chatbot: The Tragic Story of Sewell Setzer

Content Warning: This episode contains references to suicide, self-harm, and sexual abuse. Megan Garcia lost her son Sewell to suicide after he was abused and manipulated by AI chatbots for months. Now...

24 Oct 2024, 49 min

Is It AI? One Tool to Tell What’s Real with Truemedia.org CEO Oren Etzioni

Social media disinformation did enormous damage to our shared idea of reality. Now, the rise of generative AI has unleashed a flood of high-quality synthetic media into the digital ecosystem. As a res...

10 Oct 2024, 25 min

'A Turning Point in History': Yuval Noah Harari on AI’s Cultural Takeover

Historian Yuval Noah Harari says that we are at a critical turning point, one in which AI’s ability to generate cultural artifacts threatens humanity’s role as the shapers of history. History will sti...

7 Oct 2024, 1 h 30 min

‘We Have to Get It Right’: Gary Marcus On Untamed AI

It’s a confusing moment in AI. Depending on who you ask, we’re either on the fast track to AI that’s smarter than most humans, or the technology is about to hit a wall. Gary Marcus is in the latter ca...

26 Sep 2024, 41 min

AI Is Moving Fast. We Need Laws that Will Too.

AI is moving fast. And as companies race to roll out newer, more capable models with little regard for safety, the downstream risks of those models become harder and harder to counter. On this week’s ep...

13 Sep 2024, 39 min
