#44 Classic episode - Paul Christiano on finding real solutions to the AI alignment problem

#44 Classic episode - Paul Christiano on finding real solutions to the AI alignment problem

Rebroadcast: this episode was originally released in October 2018.

Paul Christiano is one of the smartest people I know. After our first session produced such great material, we decided to do a second recording, resulting in our longest interview so far. While challenging at times I can strongly recommend listening — Paul works on AI himself and has a very unusually thought through view of how it will change the world. This is now the top resource I'm going to refer people to if they're interested in positively shaping the development of AI, and want to understand the problem better. Even though I'm familiar with Paul's writing I felt I was learning a great deal and am now in a better position to make a difference to the world.

A few of the topics we cover are:

• Why Paul expects AI to transform the world gradually rather than explosively and what that would look like
• Several concrete methods OpenAI is trying to develop to ensure AI systems do what we want even if they become more competent than us
• Why AI systems will probably be granted legal and property rights
• How an advanced AI that doesn't share human goals could still have moral value
• Why machine learning might take over science research from humans before it can do most other tasks
• Which decade we should expect human labour to become obsolete, and how this should affect your savings plan.

Links to learn more, summary and full transcript.
Rohin Shah's AI alignment newsletter.

Here's a situation we all regularly confront: you want to answer a difficult question, but aren't quite smart or informed enough to figure it out for yourself. The good news is you have access to experts who *are* smart enough to figure it out. The bad news is that they disagree.

If given plenty of time — and enough arguments, counterarguments and counter-counter-arguments between all the experts — should you eventually be able to figure out which is correct? What if one expert were deliberately trying to mislead you? And should the expert with the correct view just tell the whole truth, or will competition force them to throw in persuasive lies in order to have a chance of winning you over?

In other words: does 'debate', in principle, lead to truth?

According to Paul Christiano — researcher at the machine learning research lab OpenAI and legendary thinker in the effective altruism and rationality communities — this question is of more than mere philosophical interest. That's because 'debate' is a promising method of keeping artificial intelligence aligned with human goals, even if it becomes much more intelligent and sophisticated than we are.

It's a method OpenAI is actively trying to develop, because in the long-term it wants to train AI systems to make decisions that are too complex for any human to grasp, but without the risks that arise from a complete loss of human oversight.

If AI-1 is free to choose any line of argument in order to attack the ideas of AI-2, and AI-2 always seems to successfully defend them, it suggests that every possible line of argument would have been unsuccessful.

But does that mean that the ideas of AI-2 were actually right? It would be nice if the optimal strategy in debate were to be completely honest, provide good arguments, and respond to counterarguments in a valid way. But we don't know that's the case.

The 80,000 Hours Podcast is produced by Keiran Harris.

Episoder(320)

#221 – Kyle Fish on the most bizarre findings from 5 AI welfare experiments

#221 – Kyle Fish on the most bizarre findings from 5 AI welfare experiments

What happens when you lock two AI systems in a room together and tell them they can discuss anything they want?According to experiments run by Kyle Fish — Anthropic’s first AI welfare researcher — som...

28 Aug 20252h 28min

How not to lose your job to AI (article by Benjamin Todd)

How not to lose your job to AI (article by Benjamin Todd)

About half of people are worried they’ll lose their job to AI. They’re right to be concerned: AI can now complete real-world coding tasks on GitHub, generate photorealistic video, drive a taxi more sa...

31 Jul 202551min

Rebuilding after apocalypse: What 13 experts say about bouncing back

Rebuilding after apocalypse: What 13 experts say about bouncing back

What happens when civilisation faces its greatest tests?This compilation brings together insights from researchers, defence experts, philosophers, and policymakers on humanity’s ability to survive and...

15 Jul 20254h 26min

#220 – Ryan Greenblatt on the 4 most likely ways for AI to take over, and the case for and against AGI in <8 years

#220 – Ryan Greenblatt on the 4 most likely ways for AI to take over, and the case for and against AGI in <8 years

Ryan Greenblatt — lead author on the explosive paper “Alignment faking in large language models” and chief scientist at Redwood Research — thinks there’s a 25% chance that within four years, AI will b...

8 Jul 20252h 50min

#219 – Toby Ord on graphs AI companies would prefer you didn't (fully) understand

#219 – Toby Ord on graphs AI companies would prefer you didn't (fully) understand

The era of making AI smarter just by making it bigger is ending. But that doesn’t mean progress is slowing down — far from it. AI models continue to get much more powerful, just using very different m...

24 Jun 20252h 48min

#218 – Hugh White on why Trump is abandoning US hegemony – and that’s probably good

#218 – Hugh White on why Trump is abandoning US hegemony – and that’s probably good

For decades, US allies have slept soundly under the protection of America’s overwhelming military might. Donald Trump — with his threats to ditch NATO, seize Greenland, and abandon Taiwan — seems hell...

12 Jun 20252h 48min

#217 – Beth Barnes on the most important graph in AI right now — and the 7-month rule that governs its progress

#217 – Beth Barnes on the most important graph in AI right now — and the 7-month rule that governs its progress

AI models today have a 50% chance of successfully completing a task that would take an expert human one hour. Seven months ago, that number was roughly 30 minutes — and seven months before that, 15 mi...

2 Jun 20253h 47min

Beyond human minds: The bewildering frontier of consciousness in insects, AI, and more

Beyond human minds: The bewildering frontier of consciousness in insects, AI, and more

What if there’s something it’s like to be a shrimp — or a chatbot?For centuries, humans have debated the nature of consciousness, often placing ourselves at the very top. But what about the minds of o...

23 Mai 20253h 34min

Populært innen Fakta

fastlegen
dine-penger-pengeradet
relasjonspodden-med-dora-thorhallsdottir-kjersti-idem
treningspodden
rss-strid-de-norske-borgerkrigene
foreldreradet
jakt-og-fiskepodden
rss-sunn-okonomi
merry-quizmas
gravid-uke-for-uke
fryktlos
sinnsyn
smart-forklart
rss-mann-i-krise-med-sagen
hverdagspsyken
rss-kunsten-a-leve
dopet
aldring-og-helse-podden
rss-adhd-i-klasserommet
generasjonspodden