AI Agents: Substance or Snake Oil with Arvind Narayanan - #704

Today, we're joined by Arvind Narayanan, professor of Computer Science at Princeton University, to discuss his recent works, AI Agents That Matter and AI Snake Oil. In “AI Agents That Matter”, we explore the range of agentic behaviors, the challenges in benchmarking agents, and the ‘capability and reliability gap’, which creates risks when deploying AI agents in real-world applications. We also discuss the importance of verifiers as a technique for safeguarding agent behavior. We then dig into the AI Snake Oil book, which uncovers examples of problematic and overhyped claims in AI. Arvind shares various cases of failed applications of AI, outlines a taxonomy of AI risks, and shares his insights on AI’s catastrophic risks. We also touch on different approaches to LLM-based reasoning, his views on tech policy and regulation, and his work on CORE-Bench, a benchmark designed to measure AI agents' accuracy in computational reproducibility tasks. The complete show notes for this episode can be found at https://twimlai.com/go/704.

Episodes (779)

Security and Safety in AI: Adversarial Examples, Bias and Trust w/ Moustapha Cissé - TWiML Talk #108

In this episode I’m joined by Moustapha Cissé, Research Scientist at Facebook AI Research Lab (or FAIR) Paris. Moustapha’s broad research interests include the security and safety of AI systems, and w...

6 Feb 2018, 50 min

Peering into the Home w/ Aerial.ai's Wifi Motion Analytics - TWiML Talk #107

In this episode I’m joined by Michel Allegue and Negar Ghourchian of Aerial.ai. Aerial is doing some really interesting things in the home automation space, by using wifi signal statistics to identify...

2 Feb 2018, 40 min

Physiology-Based Models for Fitness and Training w/ Firstbeat with Ilkka Korhonen - TWiML Talk #106

In this episode I'm joined by Ilkka Korhonen, Vice President of Technology at Firstbeat, a company whose algorithms are embedded in fitness watches from companies like Garmin and Suunto and which use ...

2 Feb 2018, 35 min

Machine Learning for Signal Processing Applications w/ Stuart Feffer & Brady Tsai - TWiML Talk #105

In this episode, I'm joined by Stuart Feffer, co-founder and CEO of Reality AI, which provides tools and services for engineers working with sensors and signals, and Brady Tsai, Business Development M...

1 Feb 2018, 36 min

Personalizing the Ferrari Challenge Experience w/ Intel AI - TWiML Talk #104

In this episode, I'm joined by Andy Keller and Emile Chin-Dickey to discuss Intel's partnership with the Ferrari Challenge North American Series. Andy is a Deep Learning Data Scientist at Intel and Em...

31 Jan 2018, 37 min

Deep Learning for 3D Sensors and Cameras in Lighthouse with Alex Teichman - TWiML Talk #103

In this episode, I sit down with Alex Teichman, CEO and Co-Founder of Lighthouse, a company taking a new approach to the in-home smart camera. Alex and I dig into what exactly the Lighthouse product i...

30 Jan 2018, 42 min

Computer Vision for Cozmo, the Cutest Toy Robot Everrrrr! with Andrew Stein - TWiML Talk #102

In this episode, I'm joined by Andrew Stein, computer vision engineer at consumer robotics company Anki, and his partner in crime Cozmo, a toy robot with tons of personality. Andrew joined me during t...

30 Jan 2018, 43 min

Expectation Maximization, Gaussian Mixtures & Belief Propagation, OH MY! w/ Inmar Givoni - Talk #101

In this episode I'm joined by Inmar Givoni, Autonomy Engineering Manager at Uber ATG, to discuss her work on the paper Min-Max Propagation, which was presented at NIPS last month in Long Beach. Inmar ...

26 Jan 2018, 48 min
