Evaluating LLMs with Chatbot Arena and Joseph E. Gonzalez

In this episode of Gradient Dissent, Joseph E. Gonzalez, EECS Professor at UC Berkeley and Co-Founder at RunLLM, joins host Lukas Biewald to explore innovative approaches to evaluating LLMs.

They discuss the concept of vibes-based evaluation, which examines not just accuracy but also the style and tone of model responses, and how Chatbot Arena has become a community-driven benchmark for open-source and commercial LLMs. Joseph shares insights on democratizing model evaluation, refining AI-human interactions, and leveraging human preferences to improve model performance. This episode provides a deep dive into the evolving landscape of LLM evaluation and its impact on AI development.

🎙 Get our podcasts on these platforms:

Apple Podcasts: http://wandb.me/apple-podcasts

Spotify: http://wandb.me/spotify

Google: http://wandb.me/gd_google

YouTube: http://wandb.me/youtube


Follow Weights & Biases:

https://twitter.com/weights_biases

https://www.linkedin.com/company/wandb


Join the Weights & Biases Discord Server:

https://discord.gg/CkZKRNnaf3

Episodes (134)

AI’s Future: Investment & Impact with Sarah Guo and Elad Gil

Explore the future of investment and impact in AI with host Lukas Biewald and guests Elad Gil and Sarah Guo of the No Priors podcast. Sarah is the founder of Conviction VC, an AI-centric $100 million ve...

18 Jan 2024, 1h 4min

Revolutionizing AI Data Management with Jerry Liu, CEO of LlamaIndex

In the latest episode of Gradient Dissent, we explore the innovative features and impact of LlamaIndex in AI data management with Jerry Liu, CEO of LlamaIndex. Jerry shares insights on how LlamaIndex ...

4 Jan 2024, 57min

Bridging AI and Science: The Impact of Machine Learning on Material Innovation with Joe Spisak of Meta

In the latest episode of Gradient Dissent, we hear from Joseph Spisak, Product Director, Generative AI @Meta, to explore the boundless impacts of AI and its expansive role in reshaping various sectors...

7 Dec 2023, 1h 14min

Unlocking the Power of Language Models in Enterprise: A Deep Dive with Chris Van Pelt

In the premiere episode of Gradient Dissent Business, we're joined by Weights & Biases co-founder Chris Van Pelt for a deep dive into the world of large language models like GPT-3.5 and GPT-4. Chris b...

16 Nov 2023, 52min

Providing Greater Access to LLMs with Brandon Duderstadt, Co-Founder and CEO of Nomic AI

On this episode, we’re joined by Brandon Duderstadt, Co-Founder and CEO of Nomic AI. Both of Nomic AI’s products, Atlas and GPT4All, aim to improve the explainability and accessibility of AI. We discus...

27 July 2023, 1h 1min

Exploring PyTorch and Open-Source Communities with Soumith Chintala, VP/Fellow of Meta, Co-Creator of PyTorch

On this episode, we’re joined by Soumith Chintala, VP/Fellow of Meta and Co-Creator of PyTorch. Soumith and his colleagues’ open-source framework impacted both the development process and the end-user...

13 July 2023, 1h 8min

Advanced AI Accelerators and Processors with Andrew Feldman of Cerebras Systems

On this episode, we’re joined by Andrew Feldman, Founder and CEO of Cerebras Systems. Andrew and the Cerebras team are responsible for building the largest-ever computer chip and the fastest AI-specif...

22 June 2023, 1h

Enabling LLM-Powered Applications with Harrison Chase of LangChain

On this episode, we’re joined by Harrison Chase, Co-Founder and CEO of LangChain. Harrison and his team at LangChain are on a mission to make the process of creating applications powered by LLMs as ea...

1 June 2023, 51min
