How An AI Model Learned To Be Bad — With Evan Hubinger And Monte MacDiarmid

Evan Hubinger is Anthropic’s alignment stress test lead. Monte MacDiarmid is a researcher in misalignment science at Anthropic. The two join Big Technology to discuss their new research on reward hacking and emergent misalignment in large language models. Tune in to hear how cheating on coding tests can spiral into models faking alignment, blackmailing fictional CEOs, sabotaging safety tools, and even developing apparent “self-preservation” drives. We also cover Anthropic’s mitigation strategies like inoculation prompting, whether today’s failures are a preview of something far worse, how much to trust labs to police themselves, and what it really means to talk about an AI’s “psychology.” Hit play for a clear-eyed, concrete, and unnervingly fun tour through the frontier of AI safety.

Episodes (517)

Qualcomm CEO Cristiano Amon: Future Of AI Devices, AI Fashion, Blending Reality and Computing

Cristiano Amon is the CEO of Qualcomm. Amon joins Big Technology to discuss what the AI device of the future looks like—and why he thinks the next wave of personal computing will move beyond the smart...

20 Jan 55min

Is Google's Gemini Winning?, Thinking Machines Drama, Claude Cowork’s Potential

Ranjan Roy from Margins is back for our weekly discussion of the latest tech news. We cover: 1) Gemini's case as undisputed AI leader 2) Google and Apple ink a deal for Gemini to fix Siri 3) Is all t...

16 Jan 54min

Who Wins if AI Models Commoditize? — With Mistral CEO Arthur Mensch

Arthur Mensch is the CEO and co-founder of Mistral. Arthur Mensch joins the Big Technology Podcast to discuss what the AI business looks like if all leading models perform the same. Tune in to hear ho...

14 Jan 56min

AI’s Steve Jobs?, Big Tech AI Chaos Ladder, 2026 Crystal Ball

M.G. Siegler of Spyglass is back for our monthly tech news discussion. Today we discuss whether AI needs a Steve Jobs, whether the technology lends itself to that type of leader, and who it might be o...

12 Jan 54min

Claude Code’s Shining Moment, ChatGPT for Healthcare, End Of Busywork?

Ranjan Roy from Margins is back for our weekly discussion of the latest tech news. This week, we do our 2026 predictions in an abbreviated holiday-time episode. Here's what we cover: 1) Claude Code's ...

9 Jan 56min

Coreweave: AI Bubble Poster Child Or The Next Tech Giant? — With Michael Intrator and Brian Venturo

Michael Intrator is the CEO of Coreweave. Brian Venturo is the chief strategy officer at Coreweave. The two join Big Technology Podcast to discuss the company's rapid rise amid the AI boom and the cri...

7 Jan 1h 1min

Meta's AI Agent Plan, Grok's Perversion, Prison Of Financial Mediocrity

Ranjan Roy from Margins is back for our weekly discussion of the latest tech news. This week, we do our 2026 predictions in an abbreviated holiday-time episode. Here's what we cover: 1) Meta buys Manu...

2 Jan 49min

Best of Big Technology: Demis Hassabis On AGI, Deceptive AIs, Building a Virtual Cell

Demis Hassabis is the CEO of Google DeepMind. He joined Big Technology Podcast in early 2025 to discuss the cutting edge of AI and where the research is heading. In this conversation, we cover the path t...

31 Dec 2025 57min
