#235 – Ajeya Cotra on whether it’s crazy that every AI company’s safety plan is ‘use AI to make AI safe’

Every major AI company has the same safety plan: when AI gets crazy powerful and really dangerous, they’ll use the AI itself to figure out how to make AI safe and beneficial. It sounds circular, almost satirical. But is it actually a bad plan?

Today’s guest, Ajeya Cotra, recently placed 3rd out of 413 participants forecasting AI developments and is among the most thoughtful and respected commentators on where the technology is going.

She thinks there’s a meaningful chance we’ll see as much change in the next 23 years as humanity faced in the last 10,000, thanks to the arrival of artificial general intelligence. Ajeya doesn’t reach this conclusion lightly: she’s had a ring-side seat to the growth of all the major AI companies for 10 years — first as a researcher and grantmaker for technical AI safety at Coefficient Giving (formerly known as Open Philanthropy), and now as a member of technical staff at METR.

So host Rob Wiblin asked her: is this plan to use AI to save us from AI a reasonable one?

Ajeya agrees that humanity has repeatedly used technologies that create new problems to help solve those problems. After all:

  • Cars enabled carjackings and drive-by shootings, but also faster police pursuits.
  • Microbiology enabled bioweapons, but also faster vaccine development.
  • The internet allowed lies to disseminate faster, but had exactly the same impact for fact checks.

But she also thinks this will be a much harder case. In her view, the window between AI automating AI research and the arrival of uncontrollably powerful superintelligence could be quite brief — perhaps a year or less. In that narrow window, we’d need to redirect enormous amounts of AI labour away from making AI smarter and towards alignment research, biodefence, cyberdefence, adapting our political structures, and improving our collective decision-making.

The plan might fail just because the idea is flawed at conception: it does sound a bit crazy to use an AI you don’t trust to make sure that same AI benefits humanity.

But if we find some clever technique to overcome that, we could still fail — because the companies simply don’t follow through on their promises. They say redirecting resources to alignment and security is their strategy for dealing with the risks generated by their research — but none have quantitative commitments about what fraction of AI labour they’ll redirect during crunch time. And the competitive pressures during a recursive self-improvement loop could be irresistible.

In today’s conversation, Ajeya and Rob discuss what assumptions this plan requires, the specific problems AI could help solve during crunch time, and why — even if we pull it off — we’ll be white-knuckling it the whole way through.


Links to learn more, video, and full transcript: https://80k.info/ac26

This episode was recorded on October 20, 2025.

Chapters:

  • Cold open (00:00:00)
  • Ajeya’s strong track record for identifying key AI issues (00:00:43)
  • The 1,000-fold disagreement about AI’s effect on economic growth (00:02:30)
  • Could any evidence actually change people’s minds? (00:22:48)
  • The most dangerous AI progress might remain secret (00:29:55)
  • White-knuckling the 12-month window after automated AI R&D (00:46:16)
  • AI help is most valuable right before things go crazy (01:10:36)
  • Foundations should go from paying researchers to paying for inference (01:23:08)
  • Will frontier AI even be for sale during the explosion? (01:30:21)
  • Pre-crunch prep: what we should do right now (01:42:10)
  • A grantmaking trial by fire at Coefficient Giving (01:45:12)
  • Sabbatical and reflections on effective altruism (02:05:32)
  • The mundane factors that drive career satisfaction (02:34:33)
  • EA as an incubator for avant-garde causes others won’t touch (02:44:07)

Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
Music: CORBIT
Coordination, transcriptions, and web: Katy Moore
