Evaluating LLMs with Chatbot Arena and Joseph E. Gonzalez

In this episode of Gradient Dissent, Joseph E. Gonzalez, EECS Professor at UC Berkeley and Co-Founder at RunLLM, joins host Lukas Biewald to explore innovative approaches to evaluating LLMs.

They discuss the concept of vibes-based evaluation, which examines not just accuracy but also the style and tone of model responses, and how Chatbot Arena has become a community-driven benchmark for open-source and commercial LLMs. Joseph shares insights on democratizing model evaluation, refining AI-human interactions, and leveraging human preferences to improve model performance. This episode provides a deep dive into the evolving landscape of LLM evaluation and its impact on AI development.

🎙 Get our podcasts on these platforms:

Apple Podcasts: http://wandb.me/apple-podcasts

Spotify: http://wandb.me/spotify

Google: http://wandb.me/gd_google

YouTube: http://wandb.me/youtube

Follow Weights & Biases:

https://twitter.com/weights_biases

https://www.linkedin.com/company/wandb

Join the Weights & Biases Discord Server:

https://discord.gg/CkZKRNnaf3

This episode was added to Podme via an open RSS feed and is not Podme's own production. The episode may therefore contain advertising.

Episodes (136)

Anthony Goldbloom — How to Win Kaggle Competitions

Anthony Goldbloom is the founder and CEO of Kaggle. In 2011 & 2012, Forbes Magazine named Anthony as one of the 30 under 30 in technology. In 2011, Fast Company featured him as one of the innovative t...

9 Sep 2020, 44 min

Suzana Ilić — Cultivating Machine Learning Communities

👩‍💻 Today our guest is Suzana Ilić! Suzana is a founder of Machine Learning Tokyo, a nonprofit organization dedicated to democratizing machine learning. They are a team of ML Engineers and R...

2 Sep 2020, 34 min

Jeremy Howard — The Story of fast.ai and Why Python Is Not the Future of ML

Jeremy Howard is a founding researcher at fast.ai, a research institute dedicated to making Deep Learning more accessible. Previously, he was the CEO and Founder at Enlitic, an advanced machine learni...

25 Aug 2020, 51 min

Anantha Kancherla — Building Level 5 Autonomous Vehicles

As Lyft’s VP of Engineering, Software at Level 5, Autonomous Vehicle Program, Anantha Kancherla has a bird's-eye view of what it takes to make self-driving cars work in the real world. He previously wo...

12 Aug 2020, 44 min

Bharath Ramsundar — Deep Learning for Molecules and Medicine Discovery

Bharath created the deepchem.io open-source project to grow the deep drug discovery open source community, co-created the moleculenet.ai benchmark suite to facilitate development of molecular algorith...

5 Aug 2020, 55 min

Chip Huyen — ML Research and Production Pipelines

Chip Huyen is a writer and computer scientist currently working at a startup that focuses on machine learning production pipelines. Previously, she’s worked at NVIDIA, Netflix, and Primer. She helped ...

29 Jul 2020, 43 min

Peter Skomoroch — Product Management for AI

👨🏻‍💻 Our guest on this episode of Gradient Dissent is Peter Skomoroch! Peter is the former head of data products at Workday and LinkedIn. Previously, he was the cofounder and CEO of venture-backed de...

21 Jul 2020, 1 h 27 min

Josh Tobin — Productionizing ML Models

Josh Tobin is a researcher working at the intersection of machine learning and robotics. His research focuses on applying deep reinforcement learning, generative models, and synthetic data to problems...

8 Jul 2020, 48 min
