#158 – Holden Karnofsky on how AIs might take over even if they're no smarter than humans, and his 4-part playbook for AI risk

Back in 2007, Holden Karnofsky cofounded GiveWell, where he sought out the charities that most cost-effectively helped save lives. He then cofounded Open Philanthropy, where he oversaw a team making billions of dollars’ worth of grants across a range of areas: pandemic control, criminal justice reform, farmed animal welfare, and making AI safe, among others. This year, having learned about AI for years and observed recent events, he's narrowing his focus once again, this time on making the transition to advanced AI go well.

In today's conversation, Holden returns to the show to share his overall understanding of the promise and the risks posed by machine intelligence, and what to do about it. That understanding has accumulated over around 14 years, during which he went from being sceptical that AI was important or risky, to making AI risks the focus of his work.

Links to learn more, summary and full transcript.

(As Holden reminds us, his wife is also the president of one of the world's top AI labs, Anthropic, giving him both conflicts of interest and a front-row seat to recent events. For our part, Open Philanthropy is 80,000 Hours' largest financial supporter.)

One point he makes is that people are too narrowly focused on AI becoming 'superintelligent.' While that could happen and would be important, it's not necessary for AI to be transformative or perilous. Rather, machines with human levels of intelligence could end up being enormously influential simply if the world's computer hardware could run tens or hundreds of billions of them, in a sense making machine intelligences a majority of the global population, or at least a majority of global thought.

As Holden explains, he sees four key parts to the playbook humanity should use to guide the transition to very advanced AI in a positive direction: alignment research, standards and monitoring, creating a successful and careful AI lab, and finally, information security.

In today’s episode, host Rob Wiblin interviews return guest Holden Karnofsky about that playbook, as well as:

  • Why we can’t rely on just gradually solving those problems as they come up, the way we usually do with new technologies.
  • What multiple different groups can do to improve our chances of a good outcome — including listeners to this show, governments, computer security experts, and journalists.
  • Holden’s case against 'hardcore utilitarianism' and what actually motivates him to work hard for a better world.
  • What the ML and AI safety communities get wrong in Holden's view.
  • Ways we might succeed with AI just by dumb luck.
  • The value of laying out imaginable success stories.
  • Why information security is so important and underrated.
  • Whether it's good to work at an AI lab that you think is particularly careful.
  • The track record of futurists’ predictions.
  • And much more.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript.

Producer: Keiran Harris
Audio Engineering Lead: Ben Cordell

Technical editing: Simon Monsour and Milo McGuire

Transcriptions: Katy Moore

This episode is taken from an open RSS feed and is not published by Podme. It may therefore contain ads.

Episodes (340)

We can guess what intergalactic war would look like. And strangely, it matters.

Intergalactic war is probably billions of years away — yet physics can already tell us how it ends. And strangely that conclusion is relevant to decisions people have to make today. In this video, Rob ...

18 Jun 15min

How AI could create the world’s biggest problems (article by Zershaaneh Qureshi)

Imagine you’re living 15,000 years ago. Your people are hunter-gatherers and you sleep under the stars. If someone told you humans would one day build cities with millions of people, fly through the a...

11 Jun 1h 29min

What it's really like to run AGI safety at Google DeepMind (and where I disagree with 'doomers') | Rohin Shah

Most people working on AI safety think without a massive effort AI systems will probably end up with goals catastrophically different from humanity’s. Today’s guest, Rohin Shah — head of AGI Safety an...

2 Jun 2h 48min

What makes for a dream job? | Benjamin Todd

What actually makes a job fulfilling? It's not what most career advice tells you. "Follow your passion" sounds inspiring, but it's misleading — and the research backs that up. Drawing on hundreds of st...

28 May 28min

We’re updating our career advice for the strangest time in history | Benjamin Todd, author of 80,000 Hours

The average career is 80,000 hours long. With AI advancing so rapidly, the hours you have left in your career matter more than ever. Some leading AI researchers think there’s a 10% chance that AI syste...

26 May 1h 6min

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

A red-teamer was embedded inside Anthropic for three weeks, told to imagine he was an evil Claude, and asked to figure out how to launch a ‘rogue AI deployment’ without getting caught. It’s one part o...

20 May 20min

#243 – 'Godfather of AI' Yoshua Bengio: "I now see a path" to safe superintelligent AI

The co-inventor of modern AI and the most cited living scientist believes he's figured out how to ensure AI is honest, incapable of deception, and never goes rogue. Yoshua Bengio – Turing Award Winner...

7 May 2h 35min

'95% of AI Pilots Fail': The hidden agenda behind the viral stat that misled millions

You might have heard that '95% of corporate AI pilots' are failing. It was one of the most widely cited AI statistics of 2025, parroted by media outlets everywhere. It helped trigger a Nasdaq selloff ...

28 Apr 10min
