Evaluating LLMs with Chatbot Arena and Joseph E. Gonzalez

In this episode of Gradient Dissent, Joseph E. Gonzalez, EECS Professor at UC Berkeley and Co-Founder at RunLLM, joins host Lukas Biewald to explore innovative approaches to evaluating LLMs.

They discuss the concept of vibes-based evaluation, which examines not just accuracy but also the style and tone of model responses, and how Chatbot Arena has become a community-driven benchmark for open-source and commercial LLMs. Joseph shares insights on democratizing model evaluation, refining AI-human interactions, and leveraging human preferences to improve model performance. This episode provides a deep dive into the evolving landscape of LLM evaluation and its impact on AI development.

🎙 Get our podcasts on these platforms:

Apple Podcasts: http://wandb.me/apple-podcasts

Spotify: http://wandb.me/spotify

Google: http://wandb.me/gd_google

YouTube: http://wandb.me/youtube

Follow Weights & Biases:

https://twitter.com/weights_biases

https://www.linkedin.com/company/wandb

Join the Weights & Biases Discord Server:

https://discord.gg/CkZKRNnaf3

This episode was added to Podme via an open RSS feed and is not Podme's own production. It may therefore contain advertising.

Episodes (136)

Advanced AI Accelerators and Processors with Andrew Feldman of Cerebras Systems

On this episode, we’re joined by Andrew Feldman, Founder and CEO of Cerebras Systems. Andrew and the Cerebras team are responsible for building the largest-ever computer chip and the fastest AI-specif...

22 June 2023, 1h

Enabling LLM-Powered Applications with Harrison Chase of LangChain

On this episode, we’re joined by Harrison Chase, Co-Founder and CEO of LangChain. Harrison and his team at LangChain are on a mission to make the process of creating applications powered by LLMs as ea...

1 June 2023, 51min

Deploying Autonomous Mobile Robots with Jean Marc Alkazzi at idealworks

On this episode, we’re joined by Jean Marc Alkazzi, Applied AI at idealworks. Jean focuses his attention on applied AI, leveraging the use of autonomous mobile robots (AMRs) to improve efficiency with...

18 May 2023, 58min

How EleutherAI Trains and Releases LLMs: Interview with Stella Biderman

On this episode, we’re joined by Stella Biderman, Executive Director at EleutherAI and Lead Scientist - Mathematician at Booz Allen Hamilton. EleutherAI is a grassroots collective that enables open-sou...

4 May 2023, 57min

Scaling LLMs and Accelerating Adoption with Aidan Gomez at Cohere

On this episode, we’re joined by Aidan Gomez, Co-Founder and CEO at Cohere. Cohere develops and releases a range of innovative AI-powered tools and solutions for a variety of NLP use cases. We discuss:...

20 April 2023, 51min

Neural Network Pruning and Training with Jonathan Frankle at MosaicML

Jonathan Frankle, Chief Scientist at MosaicML and Assistant Professor of Computer Science at Harvard University, joins us on this episode. With comprehensive infrastructure and software tools, MosaicM...

4 April 2023, 1h 2min

Jasper AI's Dave Rogenmoser & Saad Ansari on Growing & Maintaining an LLM-Based Company

In this episode of Gradient Dissent, Lukas interviews Dave Rogenmoser (CEO & Co-Founder) and Saad Ansari (Director of AI) of Jasper AI, a generative AI company with a focus on text g...

16 March 2023, 1h 9min

Shreya Shankar — Operationalizing Machine Learning

Shreya Shankar is a computer scientist, PhD student in databases at UC Berkeley, and co-author of "Operationalizing Machine Learning: An Interview Study", an ethnographic interview s...

3 March 2023, 54min
