AI Agents: Substance or Snake Oil with Arvind Narayanan - #704

Today, we're joined by Arvind Narayanan, professor of Computer Science at Princeton University, to discuss his recent works, AI Agents That Matter and AI Snake Oil. In “AI Agents That Matter”, we explore the range of agentic behaviors, the challenges in benchmarking agents, and the ‘capability and reliability gap’, which creates risks when deploying AI agents in real-world applications. We also discuss the importance of verifiers as a technique for safeguarding agent behavior. We then dig into the AI Snake Oil book, which uncovers examples of problematic and overhyped claims in AI. Arvind shares various examples of failed AI applications, outlines a taxonomy of AI risks, and offers his insights on AI’s catastrophic risks. We also touch on different approaches to LLM-based reasoning, his views on tech policy and regulation, and his work on CORE-Bench, a benchmark designed to measure AI agents' accuracy in computational reproducibility tasks. The complete show notes for this episode can be found at https://twimlai.com/go/704.

Episodes (783)

Edutainment for AI and AWS PartyRock with Mike Miller - #661

Today we’re joined by Mike Miller, director of product at AWS responsible for the company’s “edutainment” products. In our conversation with Mike, we explore AWS PartyRock, a no-code generative AI app...

18 Dec 2023, 29 min

Data, Systems and ML for Visual Understanding with Cody Coleman - #660

Today we’re joined by Cody Coleman, co-founder and CEO of Coactive AI. In our conversation with Cody, we discuss how Coactive has leveraged modern data, systems, and machine learning techniques to del...

14 Dec 2023, 38 min

Patterns and Middleware for LLM Applications with Kyle Roche - #659

Today we’re joined by Kyle Roche, founder and CEO of Griptape to discuss patterns and middleware for LLM applications. We dive into the emerging patterns for developing LLM applications, such as off p...

11 Dec 2023, 35 min

AI Access and Inclusivity as a Technical Challenge with Prem Natarajan - #658

Today we’re joined by Prem Natarajan, chief scientist and head of enterprise AI at Capital One. In our conversation, we discuss AI access and inclusivity as technical challenges and explore some of Pr...

4 Dec 2023, 41 min

Building LLM-Based Applications with Azure OpenAI with Jay Emery - #657

Today we’re joined by Jay Emery, director of technical sales & architecture at Microsoft Azure. In our conversation with Jay, we discuss the challenges faced by organizations when building LLM-based a...

28 Nov 2023, 43 min

Visual Generative AI Ecosystem Challenges with Richard Zhang - #656

Today we’re joined by Richard Zhang, senior research scientist at Adobe Research. In our conversation with Richard, we explore the research challenges that arise when regarding visual generative AI fr...

20 Nov 2023, 40 min

Deploying Edge and Embedded AI Systems with Heather Gorr - #655

Today we’re joined by Heather Gorr, principal MATLAB product marketing manager at MathWorks. In our conversation with Heather, we discuss the deployment of AI models to hardware devices and embedded A...

13 Nov 2023, 38 min

AI Sentience, Agency and Catastrophic Risk with Yoshua Bengio - #654

Today we’re joined by Yoshua Bengio, professor at Université de Montréal. In our conversation with Yoshua, we discuss AI safety and the potentially catastrophic risks of its misuse. Yoshua highlights ...

6 Nov 2023, 48 min
