AI Agents: Substance or Snake Oil with Arvind Narayanan - #704

Today, we're joined by Arvind Narayanan, professor of Computer Science at Princeton University, to discuss his recent works, AI Agents That Matter and AI Snake Oil. In “AI Agents That Matter”, we explore the range of agentic behaviors, the challenges in benchmarking agents, and the ‘capability and reliability gap’, which creates risks when deploying AI agents in real-world applications. We also discuss the importance of verifiers as a technique for safeguarding agent behavior. We then dig into the AI Snake Oil book, which uncovers examples of problematic and overhyped claims in AI. Arvind shares various cases of failed AI applications, outlines a taxonomy of AI risks, and offers his insights on AI’s catastrophic risks. We also touch on different approaches to LLM-based reasoning, his views on tech policy and regulation, and his work on CORE-Bench, a benchmark designed to measure AI agents' accuracy in computational reproducibility tasks. The complete show notes for this episode can be found at https://twimlai.com/go/704.

Episodes (781)

Towards Abstract Robotic Understanding with Raja Chatila - TWiML Talk #118

In this episode, we're joined by Raja Chatila, director of Intelligent Systems and Robotics at Pierre and Marie Curie University in Paris, and executive committee chair of the IEEE global initiative o...

12 Mar 2018 · 47 min

Discovering Exoplanets w/ Deep Learning with Chris Shallue - TWiML Talk #117

Earlier this week, I had a chance to speak with Chris Shallue, Senior Software Engineer on the Google Brain Team, about his project and paper on “Exploring Exoplanets with Deep Learning.” This is a gr...

8 Mar 2018 · 45 min

Learning Active Learning with Ksenia Konyushkova - TWiML Talk #116

In this episode, I speak with Ksenia Konyushkova, Ph.D. student in the CVLab at Ecole Polytechnique Federale de Lausanne in Switzerland. Ksenia and I connected at NIPS in December to discuss her inter...

5 Mar 2018 · 31 min

Machine Learning Platforms at Uber with Mike Del Balso - TWiML Talk #115

In this episode, I speak with Mike Del Balso, Product Manager for Machine Learning Platforms at Uber. Mike and I sat down last fall at the Georgian Partners Portfolio conference to discuss his present...

1 Mar 2018 · 49 min

Inverse Programming for Deeper AI with Zenna Tavares - TWiML Talk #114

For today’s show, the final episode of our Black in AI Series, I’m joined by Zenna Tavares, a PhD student in both the department of Brain and Cognitive Sciences and the Computer Science and Artifi...

26 Feb 2018 · 28 min

Statistical Relational Artificial Intelligence with Sriraam Natarajan - TWiML Talk #113

In this episode, I speak with Sriraam Natarajan, Associate Professor in the Department of Computer Science at UT Dallas. While at NIPS a few months back, Sriraam and I sat down to discuss his work on ...

23 Feb 2018 · 47 min

Classical Machine Learning for Infant Medical Diagnosis with Charles Onu - TWiML Talk #112

In this episode, part 4 in our Black in AI series, I'm joined by Charles Onu, PhD student at McGill University in Montreal and founder of Ubenwa, a startup tackling the problem of infant mortality due t...

20 Feb 2018 · 48 min

Learning "Common Sense" and Physical Concepts with Roland Memisevic - TWiML Talk #111

In today’s episode, I’m joined by Roland Memisevic, co-founder, CEO, and chief scientist at Twenty Billion Neurons. Roland joined me at the RE•WORK Deep Learning Summit in Montreal to discuss the work...

15 Feb 2018 · 32 min
