#92 – Brian Christian on the alignment problem

Brian Christian is a bestselling author with a particular knack for accurately communicating difficult or technical ideas from both mathematics and computer science.

Listeners loved our episode about his book Algorithms to Live By — so when the team read his new book, The Alignment Problem, and found it to be an insightful and comprehensive review of the state of the research into making advanced AI useful and reliably safe, getting him back on the show was a no-brainer.

Brian has so much of substance to say that this episode will likely be of interest to people who know a lot about AI as well as those who know a little, and to people who are nervous about where AI is going as well as those who aren't nervous at all.

Links to learn more, summary and full transcript.

Here’s a tease of 10 Hollywood-worthy stories from the episode:

The Riddle of Dopamine: The development of reinforcement learning solves a long-standing mystery of how humans are able to learn from their experience.
ALVINN: A student teaches a military vehicle to drive between Pittsburgh and Lake Erie, without intervention, in the early 1990s, using a computer with a tenth the processing capacity of an Apple Watch.
Couch Potato: An agent trained to be curious is stopped in its quest to navigate a maze by a paralysing TV screen.
Pitts & McCulloch: A homeless teenager and his foster father figure invent the idea of the neural net.
Tree Senility: Agents become so good at living in trees to escape predators that they forget how to leave, starve, and die.
The Danish Bicycle: A reinforcement learning agent figures out that it can better achieve its goal by riding in circles as quickly as possible than by reaching its purported destination.
Montezuma's Revenge: By 2015 a reinforcement learner can play 60 different Atari games — the majority impossibly well — but can’t score a single point on one game humans find tediously simple.
Curious Pong: Two novelty-seeking agents, forced to play Pong against one another, create increasingly extreme rallies.
AlphaGo Zero: A computer program becomes superhuman at Chess and Go in under a day by attempting to imitate itself.
Robot Gymnasts: Over the course of an hour, humans teach robots to do perfect backflips just by telling them which of two random actions looks more like a backflip.

We also cover:

• How reinforcement learning actually works, and some of its key achievements and failures
• How a lack of curiosity can leave AIs unable to do basic things
• The pitfalls of getting AI to imitate how we ourselves behave
• The benefits of getting AI to infer what we must be trying to achieve
• Why it’s good for agents to be uncertain about what they're doing
• Why Brian isn’t that worried about explicit deception
• The interviewees Brian most agrees with, and most disagrees with
• Developments since Brian finished the manuscript
• The effective altruism and AI safety communities
• And much more

Producer: Keiran Harris.
Audio mastering: Ben Cordell.
Transcriptions: Sofia Davis-Fogel.
