#243 – 'Godfather of AI' Yoshua Bengio: "I now see a path" to safe superintelligent AI

The co-inventor of modern AI and the most cited living scientist believes he's figured out how to ensure AI is honest, incapable of deception, and never goes rogue. Yoshua Bengio – Turing Award Winner and founder of LawZero – is disturbed by the many unintended drives and goals present in today's AIs, their willingness to lie, and ability to tell when they're being tested. AI companies are trying to stamp out these behaviours in a 'cat-and-mouse game' that Yoshua fears they're losing.

---

Our new book is "a ridiculously in-depth guide to finding a fulfilling career that does good" and is out now! Order from your local bookstore, or online at https://80k.info/career-guide

---

But Yoshua is optimistic: he believes the companies can win this battle decisively with a single rearrangement to how AI models are trained, and has been developing mathematical proofs to back up the claim. The core idea is that instead of training AI to predict what a human would say, or to produce responses we'd rate highly, we should train it to model what's actually true.

Yoshua argues this new architecture, which he calls 'Scientist AI,' is a small enough change that we could keep almost all the techniques and data we use to train frontier AIs like Claude and ChatGPT. And that the new architecture need not cost more, could be built iteratively, and might be more capable as well as more honest.

Links to learn more, video, and full transcript: https://80k.info/bengio

Until recently, the biggest practical objection to Scientist AI was simple: the world wants agents, and Scientist AI isn’t one. But in new research, Yoshua has extended the design and believes the same honest predictor can be turned into a capable agent without losing its "safety guarantees."

With the Scientist AI proposal on the table, Yoshua argues that it's absurd to race to get current untrustworthy AI models to design their successors, which the leading companies are attempting to do as soon as possible.

But critics argue the approach wouldn't be so technically solid in practice, and that frontier capabilities are advancing so fast, and cost so much to match, that Scientist AI risks arriving too late to matter.

Host Rob Wiblin and AI pioneer Yoshua Bengio cover all this and more in today's conversation.

LawZero is hiring! https://80k.info/lawzero-jobs

This episode was recorded on April 16, 2026.

Chapters:

Yoshua Bengio on making AI honest and safe (00:00:00)
The Scientist AI in plain English (00:02:27)
Yoshua on how Scientist AI differs from LLMs (00:06:32)
How the training data works (00:14:02)
Can this become an agent? (00:21:02)
Why Yoshua is more optimistic on alignment now (00:32:11)
Why companies can’t stop racing (00:36:35)
How close to a working prototype? (00:49:15)
Honest models might be more capable (00:53:34)
“Reinforcement learning is evil” (01:01:27)
Scientist AI from guardrail to agent (01:08:37)
Can safe AI still be competent? (01:12:38)
How much will this cost? (01:19:29)
Can it generalise beyond maths and science? (01:23:26)
A UN for superintelligence (01:39:19)
Want to work with Yoshua Bengio? (01:51:16)
Why smart people ignore AI risk (01:54:45)
Don’t let AI build the next AI (02:01:33)
Why the public doesn’t get the real risk (02:12:28)
Why Yoshua changed his mind about AI risk (02:21:27)

Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
Camera operator: Jeremy Chevillotte
Production: Nick Stockton, Elizabeth Cox, and Katy Moore

Tämä jakso on lisätty Podme-palveluun avoimen RSS-syötteen kautta eikä se ole Podmen omaa tuotantoa. Siksi jakso saattaa sisältää mainontaa.

Jaksot(340)

We can guess what intergalactic war would look like. And strangely, it matters.

Intergalactic war is probably billions of years away — yet physics can already tell us how it ends. And strangely that conclusion is relevant to decisions people have to make today.In this video, Rob ...

18 Kesä 15min

How AI could create the world’s biggest problems (article by Zershaaneh Qureshi)

Imagine you’re living 15,000 years ago. Your people are hunter-gatherers and you sleep under the stars. If someone told you humans would one day build cities with millions of people, fly through the a...

11 Kesä 1h 29min

What it's really like to run AGI safety at Google DeepMind (and where I disagree with 'doomers') | Rohin Shah

Most people working on AI safety think without a massive effort AI systems will probably end up with goals catastrophically different from humanity’s. Today’s guest, Rohin Shah — head of AGI Safety an...

2 Kesä 2h 48min

What makes for a dream job? | Benjamin Todd

What actually makes a job fulfilling? It's not what most career advice tells you. "Follow your passion" sounds inspiring, but it's misleading — and the research backs that up.Drawing on hundreds of st...

28 Touko 28min

We’re updating our career advice for the strangest time in history | Benjamin Todd, author of 80,000 Hours

The average career is 80,000 hours long. With AI advancing so rapidly, the hours you have left in your career matter more than ever.Some leading AI researchers think there’s a 10% chance that AI syste...

26 Touko 1h 6min

Can AIs already start 'rogue deployments' inside AI companies? (Landmark new METR report)

A red-teamer was embedded inside Anthropic for three weeks, told to imagine he was an evil Claude, and asked to figure out how to launch a ‘rogue AI deployment’ without getting caught. It’s one part o...

20 Touko 20min

'95% of AI Pilots Fail': The hidden agenda behind the viral stat that misled millions

You might have heard that '95% of corporate AI pilots' are failing. It was one of the most widely cited AI statistics of 2025, parroted by media outlets everywhere. It helped trigger a Nasdaq selloff ...

28 Huhti 10min