#184 – Zvi Mowshowitz on sleeping on sleeper agents, and the biggest AI updates since ChatGPT

Many of you will have heard of Zvi Mowshowitz as a superhuman information-absorbing-and-processing machine — which he definitely is. As the author of the Substack Don’t Worry About the Vase, Zvi has spent as much time as literally anyone in the world over the last two years tracking in detail how the explosion of AI has been playing out — and he has strong opinions about almost every aspect of it.

Links to learn more, summary, and full transcript.

In today’s episode, host Rob Wiblin asks Zvi for his takes on:

  • US-China negotiations
  • Whether AI progress has stalled
  • The biggest wins and losses for alignment in 2023
  • EU and White House AI regulations
  • Which major AI lab has the best safety strategy
  • The pros and cons of the Pause AI movement
  • Recent breakthroughs in capabilities
  • In what situations it’s morally acceptable to work at AI labs

Whether you agree or disagree with his views, Zvi is super informed and brimming with concrete details.


Zvi and Rob also talk about:

  • The risk of AI labs fooling themselves into believing their alignment plans are working when they may not be.
  • The “sleeper agent” issue uncovered in a recent Anthropic paper, and how it shows us how hard alignment actually is.
  • Why Zvi disagrees with 80,000 Hours’ advice about gaining career capital to have a positive impact.
  • Zvi’s project to identify the most strikingly horrible and neglected policy failures in the US, and how Zvi founded a new think tank (Balsa Research) to identify innovative solutions to overthrow the horrible status quo in areas like domestic shipping, environmental reviews, and housing supply.
  • Why Zvi thinks that improving people’s prosperity and housing can make them care more about existential risks like AI.
  • An idea from the online rationality community that Zvi thinks is really underrated and more people should have heard of: simulacra levels.
  • And plenty more.

Chapters:

  • Zvi’s AI-related worldview (00:03:41)
  • Sleeper agents (00:05:55)
  • Safety plans of the three major labs (00:21:47)
  • Misalignment vs misuse vs structural issues (00:50:00)
  • Should concerned people work at AI labs? (00:55:45)
  • Pause AI campaign (01:30:16)
  • Has progress on useful AI products stalled? (01:38:03)
  • White House executive order and US politics (01:42:09)
  • Reasons for AI policy optimism (01:56:38)
  • Zvi’s day-to-day (02:09:47)
  • Big wins and losses on safety and alignment in 2023 (02:12:29)
  • Other unappreciated technical breakthroughs (02:17:54)
  • Concrete things we can do to mitigate risks (02:31:19)
  • Balsa Research and the Jones Act (02:34:40)
  • The National Environmental Policy Act (02:50:36)
  • Housing policy (02:59:59)
  • Underrated rationalist worldviews (03:16:22)

Producer and editor: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Simon Monsour, Milo McGuire, and Dominic Armstrong
Transcriptions and additional content editing: Katy Moore
