The Startup Powering The Data Behind AGI

In this episode of Gradient Dissent, Lukas Biewald talks with Edwin Chen, CEO & founder of Surge AI, the billion-dollar company quietly powering the next generation of frontier LLMs. They discuss Surge's origin story, why traditional data labeling is broken, and how the company's research-focused approach is reshaping how models are trained.

You’ll hear why inter-annotator agreement fails in high-complexity tasks like poetry and math, why synthetic data is often overrated, and how Surge builds rich RL environments to stress-test agentic reasoning. They also go deep on what kinds of data will be critical to future progress in AI—from scientific discovery to multimodal reasoning and personalized alignment.


It’s a rare, behind-the-scenes look into the world of high-quality data generation at scale—straight from the team most frontier labs trust to get it right.


Timestamps:

00:00 – Intro: Who is Edwin Chen?

03:40 – The problem with early data labeling systems

06:20 – Search ranking, clickbait, and product principles

10:05 – Why Surge focused on high-skill, high-quality labeling

13:50 – From Craigslist workers to a billion-dollar business

16:40 – Scaling without funding and avoiding Silicon Valley status games

21:15 – Why most human data platforms lack real tech

25:05 – Detecting cheaters, liars, and low-quality labelers

28:30 – Why inter-annotator agreement is a flawed metric

32:15 – What makes a great poem? Not checkboxes

36:40 – Measuring subjective quality rigorously

40:00 – What types of data are becoming more important

44:15 – Scientific collaboration and frontier research data

47:00 – Multimodal data, Argentinian coding, and hyper-specificity

50:10 – What's wrong with LMSYS and benchmark hacking

53:20 – Personalization and taste in model behavior

56:00 – Synthetic data vs. high-quality human data


Follow Weights & Biases:

https://twitter.com/weights_biases

https://www.linkedin.com/company/wandb

Episodes (128)

Shaping the World of Robotics with Chelsea Finn

In the newest episode of Gradient Dissent, Chelsea Finn, Assistant Professor at Stanford's Computer Science Department, discusses the forefront of robotics and machine learning.

Discover her groundbreaking work, where two-armed robots learn to cook shrimp (messes included!), and discuss how robotic learning could transform student feedback in education.

We'll dive into the challenges of developing humanoid and quadruped robots, explore the limitations of simulated environments and discuss why real-world experience is key for adaptable machines. Plus, Chelsea will offer a glimpse into the future of household robotics and why it may be a few years before a robot is making your bed.

Whether you're an AI enthusiast, a robotics professional, or simply curious about the potential and future of the technology, this episode offers unique insights into the evolving world of robotics and where it's headed next.

*Subscribe to Weights & Biases* → https://bit.ly/45BCkYz

🎙 Get our podcasts on these platforms:
Apple Podcasts: http://wandb.me/apple-podcasts
Spotify: http://wandb.me/spotify
Google: http://wandb.me/gd_google
YouTube: http://wandb.me/youtube

Connect with Chelsea Finn:
https://www.linkedin.com/in/cbfinn/
https://twitter.com/chelseabfinn

Follow Weights & Biases:
https://twitter.com/weights_biases
https://www.linkedin.com/company/wandb

Join the Weights & Biases Discord Server:
https://discord.gg/CkZKRNnaf3

15 Feb 2024, 53 min

The Power of AI in Search with You.com's Richard Socher

In the latest episode of Gradient Dissent, Richard Socher, CEO of You.com, shares his insights on the power of AI in search. The episode focuses on how advanced language models like GPT-4 are transforming search engines and changing the way we interact with digital platforms. The discussion covers the practical applications and challenges of integrating AI into search functionality, as well as the ethical considerations and future implications of AI in our digital lives. Join us for an enlightening conversation on how AI and You.com are reshaping how we access and interact with information online.

*Subscribe to Weights & Biases* → https://bit.ly/45BCkYz

Timestamps:
00:00 - Introduction to Gradient Dissent Podcast
00:48 - Richard Socher’s Journey: From Linguistic Computer Science to AI
06:42 - The Genesis and Evolution of MetaMind
13:30 - Exploring You.com's Approach to Enhanced Search
18:15 - Demonstrating You.com's AI in Mortgage Calculations
24:10 - The Power of AI in Search: A Deep Dive with You.com
30:25 - Security Measures in Running AI-Generated Code
35:50 - Building a Robust and Secure AI Tech Stack
42:33 - The Role of AI in Automating and Transforming Digital Work
48:50 - Discussing Ethical Considerations and the Societal Impact of AI
55:15 - Envisioning the Future of AI in Daily Life and Work
01:02:00 - Reflecting on the Evolution of AI and Its Future Prospects
01:05:00 - Closing Remarks and Podcast Wrap-Up

🎙 Get our podcasts on these platforms:
Apple Podcasts: http://wandb.me/apple-podcasts
Spotify: http://wandb.me/spotify
Google: http://wandb.me/gd_google
YouTube: http://wandb.me/youtube

Connect with Richard Socher:
https://www.linkedin.com/in/richardsocher/
https://twitter.com/RichardSocher

Follow Weights & Biases:
https://twitter.com/weights_biases
https://www.linkedin.com/company/wandb

Join the Weights & Biases Discord Server:
https://discord.gg/CkZKRNnaf3

1 Feb 2024, 1 h 8 min

AI’s Future: Investment & Impact with Sarah Guo and Elad Gil

Explore the future of investment and impact in AI with host Lukas Biewald and guests Elad Gil and Sarah Guo of the No Priors podcast.

Sarah is the founder of Conviction VC, an AI-centric $100 million venture fund. Elad, a seasoned entrepreneur and startup investor, has invested in over 40 companies each valued at $1 billion or more, and wrote the influential "High Growth Handbook."

Join us for a deep dive into the nuanced world of AI, where we'll explore its broader industry impact, focusing on how startups can blend product-centric approaches with a balance of innovation and practical development.

*Subscribe to Weights & Biases* → https://bit.ly/45BCkYz

Timestamps:
0:00 - Introduction
5:15 - Exploring Fine-Tuning vs RAG in AI
10:30 - Evaluating AI Research for Investment
15:45 - Impact of AI Models on Product Development
20:00 - AI's Role in Evolving Job Markets
25:15 - The Balance Between AI Research and Product Development
30:00 - Code Generation Technologies in Software Engineering
35:00 - AI's Broader Industry Implications
40:00 - Importance of Product-Driven Approaches in AI Startups
45:00 - AI in Various Sectors: Beyond Software Engineering
50:00 - Open Source vs Proprietary AI Models
55:00 - AI's Impact on Traditional Roles and Industries
1:00:00 - Closing Thoughts

Thanks for listening to the Gradient Dissent podcast, brought to you by Weights & Biases. If you enjoyed this episode, please leave a review to help get the word out about the show. And be sure to subscribe so you never miss another insightful conversation.

Follow Weights & Biases:
YouTube: http://wandb.me/youtube
Twitter: https://twitter.com/weights_biases
LinkedIn: https://www.linkedin.com/company/wandb

Join the Weights & Biases Discord Server:
https://discord.gg/CkZKRNnaf3

18 Jan 2024, 1 h 4 min

Revolutionizing AI Data Management with Jerry Liu, CEO of LlamaIndex

In the latest episode of Gradient Dissent, we explore the innovative features and impact of LlamaIndex in AI data management with Jerry Liu, CEO of LlamaIndex. Jerry shares insights on how LlamaIndex integrates diverse data formats with advanced AI technologies, addressing challenges in data retrieval, analysis, and conversational memory. We also delve into the future of AI-driven systems and LlamaIndex's role in this rapidly evolving field. This episode is a must-watch for anyone interested in AI, data science, and the future of technology.

Timestamps:
0:00 - Introduction
4:46 - Differentiating LlamaIndex in the AI framework ecosystem.
9:00 - Discussing data analysis, search, and retrieval applications.
14:17 - Exploring Retrieval Augmented Generation (RAG) and vector databases.
19:33 - Implementing and optimizing One Bot in Discord.
24:19 - Developing and evaluating datasets for AI systems.
28:00 - Community contributions and the growth of LlamaIndex.
34:34 - Discussing embedding models and the use of vector databases.
39:33 - Addressing AI model hallucinations and fine-tuning.
44:51 - Text extraction applications and agent-based systems in AI.
49:25 - Community contributions to LlamaIndex and managing refactors.
52:00 - Interactions with big tech's corpus and AI context length.
54:59 - Final thoughts on underrated aspects of ML and challenges in AI.

Thanks for listening to the Gradient Dissent podcast, brought to you by Weights & Biases. If you enjoyed this episode, please leave a review to help get the word out about the show. And be sure to subscribe so you never miss another insightful conversation.

Connect with Jerry:
https://twitter.com/jerryjliu0
https://www.linkedin.com/in/jerry-liu-64390071/

Follow Weights & Biases:
YouTube: http://wandb.me/youtube
Twitter: https://twitter.com/weights_biases
LinkedIn: https://www.linkedin.com/company/wandb

Join the Weights & Biases Discord Server:
https://discord.gg/CkZKRNnaf3

4 Jan 2024, 57 min

Bridging AI and Science: The Impact of Machine Learning on Material Innovation with Joe Spisak of Meta

In the latest episode of Gradient Dissent, we hear from Joseph Spisak, Product Director, Generative AI @Meta, to explore the boundless impacts of AI and its expansive role in reshaping various sectors. We delve into the intricacies of models like GPT and Llama 2, their influence on user experiences, and AI's groundbreaking contributions to fields like biology, material science, and green hydrogen production through the Open Catalyst Project. The episode also examines AI's practical business applications, from document summarization to intelligent note-taking, addressing the ethical complexities of AI deployment. We wrap up with a discussion on the significance of open-source AI development, community collaboration, and AI democratization. Tune in for valuable insights into the expansive world of AI, relevant to developers, business leaders, and tech enthusiasts.

We discuss:
0:00 Intro
0:32 Joe is Back at Meta
3:28 What Does Meta Get Out Of Putting Out LLMs?
8:24 Measuring The Quality Of LLMs
10:55 How Do You Pick The Sizes Of Models
16:45 Advice On Choosing Which Model To Start With
24:57 The Secret Sauce In The Training
26:17 What Is Being Worked On Now
33:00 The Safety Mechanisms In Llama 2
37:00 The Datasets Llama 2 Is Trained On
38:00 On Multilingual Capabilities & Tone
43:30 On The Biggest Applications Of Llama 2
47:25 On Why The Best Teams Are Built By Users
54:01 The Culture Differences Of Meta vs Open Source
57:39 The AI Learning Alliance
1:01:34 Where To Learn About Machine Learning
1:05:10 Why AI For Science Is Under-rated
1:11:36 What Are The Biggest Issues With Real-World Applications

Thanks for listening to the Gradient Dissent podcast, brought to you by Weights & Biases. If you enjoyed this episode, please leave a review to help get the word out about the show. And be sure to subscribe so you never miss another insightful conversation.

7 Dec 2023, 1 h 14 min

Unlocking the Power of Language Models in Enterprise: A Deep Dive with Chris Van Pelt

In the premiere episode of Gradient Dissent Business, we're joined by Weights & Biases co-founder Chris Van Pelt for a deep dive into the world of large language models like GPT-3.5 and GPT-4. Chris bridges his expertise as both a tech founder and AI expert, offering key strategies for startups seeking to connect with early users, and for enterprises experimenting with AI. He highlights the melding of AI and traditional web development, sharing his insights on product evolution, leadership, and the power of customer conversations, even for the most introverted founders. He shares how personal development and authentic co-founder relationships enrich business dynamics. Join us for a compelling episode brimming with actionable advice for those looking to innovate with language models, all while managing the inherent complexities. Don't miss Chris Van Pelt's invaluable take on the future of AI in this thought-provoking installment of Gradient Dissent Business.

We discuss:
0:00 - Intro
5:59 - Impactful relationships in Chris's life
13:15 - Advice for finding co-founders
16:25 - Chris's fascination with challenging problems
22:30 - Tech stack for AI labs
30:50 - Impactful capabilities of AI models
36:24 - How this AI era is different
47:36 - Advising large enterprises on language model integration
51:18 - Using language models for business intelligence and automation
52:13 - Closing thoughts and appreciation

Thanks for listening to the Gradient Dissent Business podcast, with hosts Lavanya Shukla and Caryn Marooney, brought to you by Weights & Biases. Be sure to click the subscribe button below, to keep your finger on the pulse of this fast-moving space and hear from other amazing guests.

16 Nov 2023, 52 min

Providing Greater Access to LLMs with Brandon Duderstadt, Co-Founder and CEO of Nomic AI

On this episode, we’re joined by Brandon Duderstadt, Co-Founder and CEO of Nomic AI. Both of Nomic AI’s products, Atlas and GPT4All, aim to improve the explainability and accessibility of AI.

We discuss:
- (0:55) What GPT4All is and its value proposition.
- (6:56) The advantages of using smaller LLMs for specific tasks.
- (9:42) Brandon’s thoughts on the cost of training LLMs.
- (10:50) Details about the current state of fine-tuning LLMs.
- (12:20) What quantization is and what it does.
- (21:16) What Atlas is and what it allows you to do.
- (27:30) Training code models versus language models.
- (32:19) Details around evaluating different models.
- (38:34) The opportunity for smaller companies to build open-source models.
- (42:00) Prompt chaining versus fine-tuning models.

Resources mentioned:
Brandon Duderstadt - https://www.linkedin.com/in/brandon-duderstadt-a3269112a/
Nomic AI - https://www.linkedin.com/company/nomic-ai/
Nomic AI Website - https://home.nomic.ai/

Thanks for listening to the Gradient Dissent podcast, brought to you by Weights & Biases. If you enjoyed this episode, please leave a review to help get the word out about the show. And be sure to subscribe so you never miss another insightful conversation.

27 Jul 2023, 1 h 1 min

Exploring PyTorch and Open-Source Communities with Soumith Chintala, VP/Fellow of Meta, Co-Creator of PyTorch

On this episode, we’re joined by Soumith Chintala, VP/Fellow of Meta and Co-Creator of PyTorch. Soumith and his colleagues’ open-source framework impacted both the development process and the end-user experience of what would become PyTorch.

We discuss:
- The history of PyTorch’s development and TensorFlow’s impact on development decisions.
- How a symbolic execution model affects the implementation speed of an ML compiler.
- The strengths of different programming languages in various development stages.
- The importance of customer engagement as a measure of success instead of hard metrics.
- Why community-guided innovation offers an effective development roadmap.
- How PyTorch’s open-source nature cultivates an efficient development ecosystem.
- The role of community building in consolidating assets for more creative innovation.
- How to protect community values in an open-source development environment.
- The value of an intrinsic organizational motivation structure.
- The ongoing debate between open-source and closed-source products, especially as it relates to AI and machine learning.

Resources:
- Soumith Chintala: https://www.linkedin.com/in/soumith/
- Meta | LinkedIn: https://www.linkedin.com/company/meta/
- Meta | Website: https://about.meta.com/
- PyTorch: https://pytorch.org/

Thanks for listening to the Gradient Dissent podcast, brought to you by Weights & Biases. If you enjoyed this episode, please leave a review to help get the word out about the show. And be sure to subscribe so you never miss another insightful conversation.

13 Jul 2023, 1 h 8 min
