AI Agents: Substance or Snake Oil with Arvind Narayanan - #704

Today, we're joined by Arvind Narayanan, professor of computer science at Princeton University, to discuss his recent works, AI Agents That Matter and AI Snake Oil. In “AI Agents That Matter”, we explore the range of agentic behaviors, the challenges in benchmarking agents, and the ‘capability and reliability gap’, which creates risks when deploying AI agents in real-world applications. We also discuss the importance of verifiers as a technique for safeguarding agent behavior. We then dig into the AI Snake Oil book, which uncovers examples of problematic and overhyped claims in AI. Arvind shares several examples of failed AI applications, outlines a taxonomy of AI risks, and shares his insights on AI’s catastrophic risks. We also touch on different approaches to LLM-based reasoning, his views on tech policy and regulation, and his work on CORE-Bench, a benchmark designed to measure AI agents' accuracy in computational reproducibility tasks. The complete show notes for this episode can be found at https://twimlai.com/go/704.

Episodes (781)

Applied Machine Learning for Publishers with Naveed Ahmad - TWiML Talk #182

In today’s episode we’re joined by Naveed Ahmad, Senior Director of Data Engineering and Machine Learning at Hearst Newspapers. In our conversation, we dig into the role of ML at Hearst, including...

20 Sep 2018, 39 min

Anticipating Superintelligence with Nick Bostrom - TWiML Talk #181

In this episode, we’re joined by Nick Bostrom, professor at the University of Oxford and head of the Future of Humanity Institute, a multidisciplinary institute focused on answering big-picture questi...

17 Sep 2018, 44 min

Can We Train an AI to Understand Body Language? with Hanbyul Joo - TWIML Talk #180

In this episode, we’re joined by Hanbyul Joo, a PhD student at CMU. Han is working on what is called the “Panoptic Studio,” a multi-dimension motion capture studio used to capture human body behavio...

13 Sep 2018, 51 min

Biological Particle Identification and Tracking with Jay Newby - TWiML Talk #179

In today’s episode we’re joined by Jay Newby, Assistant Professor in the Department of Mathematical and Statistical Sciences at the University of Alberta. Jay joins us to discuss his work applying d...

10 Sep 2018, 45 min

AI for Content Creation with Debajyoti Ray - TWiML Talk #178

In today’s episode we’re joined by Debajyoti Ray, Founder and CEO of RivetAI, a startup producing AI-powered tools for storytellers and filmmakers. Deb and I discuss some of what he’s learned in the ...

6 Sep 2018, 55 min

Deep Reinforcement Learning Primer and Research Frontiers with Kamyar Azizzadenesheli - TWiML Talk #177

Today we’re joined by Kamyar Azizzadenesheli, PhD student at the University of California, Irvine, who joins us to review the core elements of RL, along with a pair of his RL-related papers: “Efficien...

30 Aug 2018, 1 h 34 min

OpenAI Five with Christy Dennison - TWiML Talk #176

Today we’re joined by Christy Dennison, Machine Learning Engineer at OpenAI, who has been working on OpenAI’s efforts to build an AI-powered agent to play the DOTA 2 video game. In our conversation we...

27 Aug 2018, 48 min

How ML Keeps Shelves Stocked at Home Depot with Pat Woowong - TWiML Talk #175

Today we’re joined by Pat Woowong, principal engineer in the applied machine intelligence group at The Home Depot. We discuss a project that Pat recently presented at the Google Cloud Next conferenc...

23 Aug 2018, 45 min
