From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy - #731

Today, we're joined by Mahesh Sathiamoorthy, co-founder and CEO of Bespoke Labs, to discuss how reinforcement learning (RL) is reshaping the way we build custom agents on top of foundation models. Mahesh highlights the crucial role of data curation, evaluation, and error analysis in model performance, explains why RL offers a more robust alternative to prompting, and describes how it can improve multi-step tool use. We also explore the limitations of supervised fine-tuning (SFT) for tool-augmented reasoning tasks, the reward-shaping strategies the team has used, and Bespoke Labs' open-source libraries such as Curator. Finally, we touch on two of their models: MiniCheck for hallucination detection and MiniChart for chart-based QA. The complete show notes for this episode can be found at https://twimlai.com/go/731.

Episodes (779)

Carlos Guestrin - Explaining the Predictions of Machine Learning Models - TWiML Talk #7

My guest this time is Carlos Guestrin, the Amazon professor of Machine Learning at the University of Washington. Carlos and I recorded this podcast at a conference, shortly after Apple's acquisition o...

9 Oct 2016, 31min

Angie Hugeback - Generating Training Data for Your ML Models - TWiML Talk #6

My guest this time is Angie Hugeback, who is principal data scientist at Spare5. Spare5 helps customers generate the high-quality labeled training datasets that are so crucial to developing accurate m...

29 Sep 2016, 1h 1min

Joshua Bloom - Machine Learning for the Stars & Productizing AI - TWiML Talk #5

My guest this time is Joshua Bloom. Josh is professor of astronomy at the University of California, Berkeley and co-founder and Chief Technology Officer of machine learning startup Wise.io. In this wi...

22 Sep 2016, 1h 28min

Charles Isbell - Interactive AI, Plus Improving ML Education - TWiML Talk #4

My guest this time is Charles Isbell, Jr., Professor and Senior Associate Dean in the College of Computing at Georgia Institute of Technology. Charles and I go back a bit… in fact he’s the first AI re...

10 Sep 2016, 1h 4min

Xavier Amatriain - Engineering Practical Machine Learning Systems - TWiML Talk #3

My guest this time is Xavier Amatriain. Xavier is a former researcher who went on to lead the machine learning recommendations team at Netflix, and is now the vice president of engineering at Quora, t...

28 Aug 2016, 56min

Siraj Raval - How to Build Confidence as an ML Developer - TWiML Talk #2

Siraj Raval is a machine learning hacker and teacher whose "Machine Learning for Hackers" and "Fresh Machine Learning" YouTube series are fun, informative, high-energy, and practical ways to learn about a ...

21 Aug 2016, 40min

This Week in ML & AI – 8/12/16: Another huge machine learning acquisition + AI in the Olympics

This Week in Machine Learning & AI brings you the week’s most interesting and important stories from the world of machine learning and artificial intelligence. This week we discuss Intel’s latest deep...

15 Aug 2016, 23min

This Week in ML & AI – 8/5/16: Apple Acquires Turi, the DARPA Hacker-Bot Challenge and More

This Week in Machine Learning & AI brings you the week’s most interesting and important stories from the world of machine learning and artificial intelligence. This week we look at Apple’s acquisition...

6 Aug 2016, 24min