AI Agents: Substance or Snake Oil with Arvind Narayanan - #704

Today, we're joined by Arvind Narayanan, professor of Computer Science at Princeton University, to discuss his recent works, AI Agents That Matter and AI Snake Oil. In “AI Agents That Matter”, we explore the range of agentic behaviors, the challenges in benchmarking agents, and the ‘capability and reliability gap’, which creates risks when deploying AI agents in real-world applications. We also discuss the importance of verifiers as a technique for safeguarding agent behavior. We then dig into the AI Snake Oil book, which uncovers examples of problematic and overhyped claims in AI. Arvind shares various cases of failed AI applications, outlines a taxonomy of AI risks, and offers his insights on AI’s catastrophic risks. We also touch on different approaches to LLM-based reasoning, his views on tech policy and regulation, and his work on CORE-Bench, a benchmark designed to measure AI agents' accuracy in computational reproducibility tasks. The complete show notes for this episode can be found at https://twimlai.com/go/704.

Episodes (779)

Bridging the Gap Between Academic and Industry Careers with Ross Fadely - TWiML Talk #68

We close out our NYU Future Labs AI Summit interview series with Ross Fadely, a New York-based AI lead with Insight Data Science. Insight is an interesting company offering a free seven-week post-doct...

16 Nov 2017, 19 min

The Limitations of Human-in-the-Loop AI with Dennis Mortensen - TWiML Talk #67

We continue our NYU Future Labs AI Summit interview series with Dennis Mortensen, founder and CEO of X.ai, a company whose AI-based personal assistant Amy helps users with scheduling meetings. I caugh...

13 Nov 2017, 35 min

Nexus Lab Cohort 2 - Second Mind - TWiML Talk #66

The podcast you’re about to hear is the fourth of a series of shows recorded at the NYU Future Labs AI Summit last week in New York City. In this show, I speak with Kul Singh, CEO and Founder of Secon...

9 Nov 2017, 21 min

Nexus Lab Cohort 2 - Bite.ai - TWiML Talk #65

The podcast you’re about to hear is the second of a series of shows recorded at the NYU Future Labs AI Summit last week in New York City. In this episode, you’ll hear from Bite.ai, a startup founded by...

8 Nov 2017, 26 min

Nexus Lab Cohort 2 - Bowtie - TWiML Talk #64

The podcast you’re about to hear is the second of a series of shows recorded at the NYU Future Labs AI Summit last week in New York City. In this episode, I speak with Ron Fisher and Mike Wang, who, a...

7 Nov 2017, 25 min

AI Nexus Lab Cohort 2 - Mt. Cleverest - TWiML Talk #63

The podcast you’re about to hear is the first of a series of shows recorded at the NYU Future Labs AI Summit last week in New York City. My guests this time around are James Villarrubia and Bernie Pra...

6 Nov 2017, 32 min

Learning to Learn, and other Opportunities in Machine Learning with Graham Taylor - TWiML Talk #62

The podcast you’re about to hear is the third of a series of shows recorded at the Georgian Partners Portfolio Conference last week in Toronto. My guest this time is Graham Taylor, professor of engine...

3 Nov 2017, 37 min

Building Conversational Application for Financial Services with Kenneth Conroy - TWiML Talk #61

The podcast you’re about to hear is the second of a series of shows recorded at the Georgian Partners Portfolio Conference last week in Toronto. My guest for this interview is Kenneth Conroy, VP of da...

1 Nov 2017, 37 min
