80,000 Hours Podcast
1 Aug 2024

#195 – Sella Nevo on who's trying to steal frontier AI models, and what they could do with them

"Computational systems have literally millions of physical and conceptual components, and around 98% of them are embedded into your infrastructure without you ever having heard of them. And an inordinate amount of them can lead to a catastrophic failure of your security assumptions. And because of this, the Iranian secret nuclear programme failed to prevent a breach, most US agencies failed to prevent multiple breaches, most US national security agencies failed to prevent breaches. So ensuring your system is truly secure against highly resourced and dedicated attackers is really, really hard." —Sella Nevo

In today’s episode, host Luisa Rodriguez speaks to Sella Nevo — director of the Meselson Center at RAND — about his team’s latest report on how to protect the model weights of frontier AI models from actors who might want to steal them.

Links to learn more, highlights, and full transcript.

They cover:

Real-world examples of sophisticated security breaches, and what we can learn from them.
Why AI model weights might be such a high-value target for adversaries like hackers, rogue states, and other bad actors.
The many ways that model weights could be stolen, from using human insiders to sophisticated supply chain hacks.
The current best practices in cybersecurity, and why they may not be enough to keep bad actors away.
New security measures that Sella hopes can mitigate the growing risks.
Sella’s work using machine learning for flood forecasting, which has significantly reduced injuries and costs from floods across Africa and Asia.
And plenty more.

Also, RAND is currently hiring for roles in technical and policy information security — check them out if you're interested in this field!

Chapters:

Cold open (00:00:00)
Luisa’s intro (00:00:56)
The interview begins (00:02:30)
The importance of securing the model weights of frontier AI models (00:03:01)
The most sophisticated and surprising security breaches (00:10:22)
AI models being leaked (00:25:52)
Researching for the RAND report (00:30:11)
Who tries to steal model weights? (00:32:21)
Malicious code and exploiting zero-days (00:42:06)
Human insiders (00:53:20)
Side-channel attacks (01:04:11)
Getting access to air-gapped networks (01:10:52)
Model extraction (01:19:47)
Reducing and hardening authorised access (01:38:52)
Confidential computing (01:48:05)
Red-teaming and security testing (01:53:42)
Careers in information security (01:59:54)
Sella’s work on flood forecasting systems (02:01:57)
Luisa’s outro (02:04:51)

Producer and editor: Keiran Harris
Audio engineering team: Ben Cordell, Simon Monsour, Milo McGuire, and Dominic Armstrong
Additional content editing: Katy Moore and Luisa Rodriguez
Transcriptions: Katy Moore

Episodes (332)

#84 – Shruti Rajagopalan on what India did to stop COVID-19 and how well it worked

When COVID-19 struck the US, everyone was told that hand sanitizer needed to be saved for healthcare professionals, so they should just wash their hands instead. But in India, many homes lack reliable...

13 Aug 2020 · 2h 58min

#83 - Jennifer Doleac on preventing crime without police and prisons

The killing of George Floyd has prompted a great deal of debate over whether the US should reduce the size of its police departments. The research literature suggests that the presence of police offic...

31 Jul 2020 · 2h 23min

#82 – James Forman Jr on reducing the cruelty of the US criminal legal system

No democracy has ever incarcerated as many people as the United States. To get its incarceration rate down to the global average, the US would have to release 3 in 4 people in its prisons today. The ...

27 Jul 2020 · 1h 28min

#81 - Ben Garfinkel on scrutinising classic AI risk arguments

80,000 Hours, along with many other members of the effective altruism movement, has argued that helping to positively shape the development of artificial intelligence may be one of the best ways to ha...

9 Jul 2020 · 2h 38min

Advice on how to read our advice (Article)

This is the fourth release in our new series of audio articles. If you want to read the original article or check out the links within it, you can find them here. "We’ve found that readers sometimes...

29 Jun 2020 · 15min

#80 – Stuart Russell on why our approach to AI is broken and how to fix it

Stuart Russell, Professor at UC Berkeley and co-author of the most popular AI textbook, thinks the way we approach machine learning today is fundamentally flawed. In his new book, Human Compatible, he...

22 Jun 2020 · 2h 13min

What anonymous contributors think about important life and career questions (Article)

Today we’re launching the final entry of our ‘anonymous answers’ series on the website. It features answers to 23 different questions including “How have you seen talented people fail in their work?...

5 Jun 2020 · 37min

#79 – A.J. Jacobs on radical honesty, following the whole Bible, and reframing global problems as puzzles

Today’s guest, New York Times bestselling author A.J. Jacobs, always hated Judge Judy. But after he found out that she was his seventh cousin, he thought, "You know what? She's not so bad." Hijacking ...

1 Jun 2020 · 2h 38min
