Evaluating LLMs with Chatbot Arena and Joseph E. Gonzalez

Evaluating LLMs with Chatbot Arena and Joseph E. Gonzalez

In this episode of Gradient Dissent, Joseph E. Gonzalez, EECS Professor at UC Berkeley and Co-Founder at RunLLM, joins host Lukas Biewald to explore innovative approaches to evaluating LLMs.

They discuss the concept of vibes-based evaluation, which examines not just accuracy but also the style and tone of model responses, and how Chatbot Arena has become a community-driven benchmark for open-source and commercial LLMs. Joseph shares insights on democratizing model evaluation, refining AI-human interactions, and leveraging human preferences to improve model performance. This episode provides a deep dive into the evolving landscape of LLM evaluation and its impact on AI development.

🎙 Get our podcasts on these platforms:

Apple Podcasts: http://wandb.me/apple-podcasts

Spotify: http://wandb.me/spotify

Google: http://wandb.me/gd_google

YouTube: http://wandb.me/youtube

Follow Weights & Biases:

https://twitter.com/weights_biases

https://www.linkedin.com/company/wandb

Join the Weights & Biases Discord Server:

https://discord.gg/CkZKRNnaf3

Tämä jakso on lisätty Podme-palveluun avoimen RSS-syötteen kautta eikä se ole Podmen omaa tuotantoa. Siksi jakso saattaa sisältää mainontaa.

Jaksot(136)

Hamel Husain — Building Machine Learning Tools

Hamel Husain — Building Machine Learning Tools

Hamel Husain is a Staff Machine Learning Engineer at Github. He has extensive experience building data analytics and predictive modeling solutions for a wide range of industries, including: hospitalit...

24 Kesä 202036min

Peter Welinder — Deep Reinforcement Learning and Robotics

Peter Welinder — Deep Reinforcement Learning and Robotics

Peter Welinder is a research scientist and roboticist at OpenAI. Before that, he was an engineer at Dropbox and ran the machine learning team, and before that, he co-founded Anchovi Labs a startup usi...

17 Kesä 202054min

Vicki Boykis — Machine Learning Across Industries

Vicki Boykis — Machine Learning Across Industries

👩‍💻Today our guest is Vicki Boykis!Vicki is a senior consultant in machine learning and engineering and works with clients to build holistic data products used for decision-making. She's previously ...

4 Kesä 202034min

Angela & Danielle — Designing ML Models for Millions of Consumer Robots

Angela & Danielle — Designing ML Models for Millions of Consumer Robots

👩‍💻👩‍💻On this episode of Gradient Dissent our guests are Angela Bassa and Danielle Dean!Angela is an expert in building and leading data teams. An MIT-trained and Edelman-award-winning mathematici...

6 Touko 202052min

Jack Clark — Building Trustworthy AI Systems

Jack Clark — Building Trustworthy AI Systems

Jack Clark is the Strategy and Communications Director at OpenAI and formerly worked as the world’s only neural network reporter at Bloomberg. Lukas and Jack discuss AI policy, ethics, and the respons...

22 Huhti 202055min

Rachael Tatman — Conversational AI and Linguistics

Rachael Tatman — Conversational AI and Linguistics

🏅 See how W&B is your secret weapon to make it onto the Kaggle leaderboards - https://www.wandb.com/kaggle👩‍💻Rachael Tatman is a developer advocate for Rasa, where she helps developers build and de...

7 Huhti 202036min

Nicolas Koumchatzky — Machine Learning in Production for Self-Driving Cars

Nicolas Koumchatzky — Machine Learning in Production for Self-Driving Cars

👨🏻‍💻Nicolas Koumchatzky is the Director of AI infrastructure at NVIDIA, where he's responsible for MagLev, the production-grade machine learning platform by NVIDIA. His team supports diverse ML use...

21 Maalis 202044min

Brandon Rohrer — Machine Learning in Production for Robots

Brandon Rohrer — Machine Learning in Production for Robots

👨🏻‍💻Brandon Rohrer is a Mechanical Engineer turned Data Scientist. He’s currently a Principal Data Scientist at iRobot and has an incredibly popular Machine Learning course at e2eML where he’s made...

11 Maalis 202034min

Suosittua kategoriassa Liike-elämä ja talous

sijotuskasti
psykopodiaa-podcast
rss-rahapodi
rss-oivalluksia-rahasta-elamasta
mimmit-sijoittaa
rss-rahamania
rss-startup-ministerio
rss-sami-miettinen-neuvottelija
hyva-paha-johtaminen
asuntoasiaa-paivakirjat
ostan-asuntoja-podcast
rahapuhetta
pomojen-suusta
sijoituspodi
juristipodi
rss-uskalla-yrittaa
rss-lahtijat
rss-bisnesta-bebeja
rss-karon-grilli
rss-seuraava-potilas