
Robert Nishihara — The State of Distributed Computing in ML
The story of Ray and what led Robert to go from reinforcement learning researcher to creating open-source tools for machine learning and beyond. Robert is currently working on Ray, a high-performance distributed execution framework for AI applications. He studied mathematics at Harvard. He’s broadly interested in applied math, machine learning, and optimization, and was a member of the Statistical AI Lab, the AMPLab/RISELab, and the Berkeley AI Research Lab at UC Berkeley. robertnishihara.com https://anyscale.com/ https://github.com/ray-project/ray https://twitter.com/robertnishihara https://www.linkedin.com/in/robert-nishihara-b6465444/ Topics covered: 0:00 sneak peek + intro 1:09 what is Ray? 3:07 Spark and Ray 5:48 reinforcement learning 8:15 non-ML use cases of Ray 10:00 RL in the real world and common uses of Ray 13:49 Python in ML 16:38 from grad school to ML tools company 20:40 pulling product requirements in surprising directions 23:25 how to manage a large open source community 27:05 Ray Tune 29:35 where do you see bottlenecks in production? 31:39 An underrated aspect of Machine Learning Visit our podcasts homepage for transcripts and more episodes! www.wandb.com/podcast Get our podcast on Apple, Spotify, and Google! Apple Podcasts: https://bit.ly/2WdrUvI Spotify: https://bit.ly/2SqtadF Google: http://tiny.cc/GD_Google Subscribe to our YouTube channel for videos of these podcasts and more Machine learning-related videos: https://www.youtube.com/c/WeightsBiases We started Weights and Biases to build tools for Machine Learning practitioners because we care a lot about the impact that Machine Learning can have in the world and we love working in the trenches with the people building these models. One of the most fun things about building these tools has been the conversations with these ML practitioners and learning about the interesting things they’re working on.
This process has been so fun that we wanted to open it up to the world in the form of our new podcast called Gradient Dissent. We hope you have as much fun listening to it as we had making it! Join our bi-weekly virtual salon and listen to industry leaders and researchers in machine learning share their research: http://tiny.cc/wb-salon Join our community of ML practitioners where we host AMA's, share interesting projects and meet other people working in Deep Learning: http://bit.ly/wb-slack Our gallery features curated machine learning reports by researchers exploring deep learning techniques, Kagglers showcasing winning models, and industry leaders sharing best practices. https://app.wandb.ai/gallery
13 Nov 2020, 35min

Ines & Sofie — Building Industrial-Strength NLP Pipelines
Sofie and Ines walk us through how the new spaCy library helps build end-to-end SOTA natural language processing workflows. Ines Montani is the co-founder of Explosion AI, a digital studio specializing in tools for AI technology. She's a core developer of spaCy, one of the leading open-source libraries for Natural Language Processing in Python, and Prodigy, a new data annotation tool powered by active learning. Before founding Explosion AI, she was a freelance front-end developer and strategist. https://twitter.com/_inesmontani Sofie Van Landeghem is a Natural Language Processing and Machine Learning engineer at Explosion.ai. She is a Software Engineer at heart, with an absurd love for quality assurance and testing, introducing proper levels of abstraction, and ensuring code robustness and modularity. She has more than 12 years of experience in Natural Language Processing and Machine Learning, including in the pharmaceutical industry and the food industry. https://twitter.com/oxykodit https://spacy.io/ https://prodi.gy/ https://thinc.ai/ https://explosion.ai/ Topics covered: 0:00 Sneak peek 0:35 intro 2:29 How spaCy was started 6:11 Business model, open source 9:55 What was spaCy designed to solve? 12:23 advances in NLP and modern practices in industry 17:19 what differentiates spaCy from a more research focused NLP library? 19:28 Multi-lingual/domain specific support 23:52 spaCy V3 configuration 28:16 Thoughts on Python, Cython, other programming languages for ML 33:45 Making things clear and reproducible 37:30 Prodigy and getting good training data 44:09 most underrated aspect of ML 51:00 hardest part of putting models into production Visit our podcasts homepage for transcripts and more episodes! www.wandb.com/podcast Get our podcast on Apple, Spotify, and Google!
Apple Podcasts: bit.ly/2WdrUvI Spotify: bit.ly/2SqtadF Google: tiny.cc/GD_Google
29 Oct 2020, 58min

Daeil Kim — The Unreasonable Effectiveness of Synthetic Data
Supercharging computer vision model performance by generating years of training data in minutes. Daeil Kim is the co-founder and CEO of AI.Reverie(https://aireverie.com/), a startup that specializes in creating high quality synthetic training data for computer vision algorithms. Before that, he was a senior data scientist at the New York Times. And before that he got his PhD in computer science from Brown University, focusing on machine learning and Bayesian statistics. He talks about tools that will advance machine learning progress, and about synthetic data. https://twitter.com/daeil Topics covered: 0:00 Diversifying content 0:23 Intro+bio 1:00 From liberal arts to synthetic data 8:48 What is synthetic data? 11:24 Real world examples of synthetic data 16:16 Understanding performance gains using synthetic data 21:32 The future of Synthetic data and AI.Reverie 23:21 The composition of people at AI.Reverie and ML 28:28 The evolution of ML tools and systems that Daeil uses 33:16 Most underrated aspect of ML and common misconceptions 34:42 Biggest challenge in making synthetic data work in the real world Visit our podcasts homepage for transcripts and more episodes! www.wandb.com/podcast Get our podcast on Apple, Spotify, and Google! Apple Podcasts: bit.ly/2WdrUvI Spotify: bit.ly/2SqtadF Google: tiny.cc/GD_Google
15 Oct 2020, 37min

Joaquin Candela — Definitions of Fairness
Joaquin chats about scaling and democratizing AI at Facebook, while understanding fairness and algorithmic bias. --- Joaquin Quiñonero Candela is Distinguished Tech Lead for Responsible AI at Facebook, where he aims to understand and mitigate the risks and unintended consequences of the widespread use of AI across Facebook. He was previously Director of Society and AI Lab and Director of Engineering for Applied ML. Before joining Facebook, Joaquin taught at the University of Cambridge, and worked at Microsoft Research. Connect with Joaquin: Personal website: https://quinonero.net/ Twitter: https://twitter.com/jquinonero LinkedIn: https://www.linkedin.com/in/joaquin-qui%C3%B1onero-candela-440844/ --- Topics Discussed: 0:00 Intro, sneak peek 0:53 Looking back at building and scaling AI at Facebook 10:31 How do you ship a model every week? 15:36 Getting buy-in to use a system 19:36 More on ML tools 24:01 Responsible AI at Facebook 38:33 How to engage with those affected by ML decisions 41:54 Approaches to fairness 53:10 How to know things are built right 59:34 Diversity, inclusion, and AI 1:14:21 Underrated aspect of AI 1:16:43 Hardest thing when putting models into production Transcript: http://wandb.me/gd-joaquin-candela Links Discussed: Race and Gender (2019): https://arxiv.org/pdf/1908.06165.pdf Lessons from Archives: Strategies for Collecting Sociocultural Data in Machine Learning (2019): https://arxiv.org/abs/1912.10389 Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification (2018): http://proceedings.mlr.press/v81/buolamwini18a.html --- Get our podcast on these platforms: Apple Podcasts: http://wandb.me/apple-podcasts Spotify: http://wandb.me/spotify Google Podcasts: http://wandb.me/google-podcasts YouTube: http://wandb.me/youtube Soundcloud: http://wandb.me/soundcloud Join our community of ML practitioners where we host AMAs, share interesting projects and meet other people working in Deep Learning: http://wandb.me/slack Check
out Fully Connected, which features curated machine learning reports by researchers exploring deep learning techniques, Kagglers showcasing winning models, industry leaders sharing best practices, and more: https://wandb.ai/fully-connected
1 Oct 2020, 1h 19min

Richard Socher — The Challenges of Making ML Work in the Real World
Richard Socher, ex-Chief Scientist at Salesforce, joins us to talk about The AI Economist, NLP protein generation and the biggest challenge in making ML work in the real world. Richard Socher was the Chief Scientist (EVP) at Salesforce where he led teams working on fundamental research(einstein.ai/), applied research, product incubation, CRM search, customer service automation and a cross-product AI platform for unstructured and structured data. Previously, he was an adjunct professor at Stanford’s computer science department and the founder and CEO/CTO of MetaMind(www.metamind.io/) which was acquired by Salesforce in 2016. In 2014, he got his PhD in the CS Department(www.cs.stanford.edu/) at Stanford. He likes paramotoring and water adventures, traveling and photography. More info: - Forbes article(https://www.forbes.com/sites/gilpress/2017/05/01/emerging-artificial-intelligence-ai-leaders-richard-socher-salesforce/) with more info about Richard's bio. - CS224n - NLP with Deep Learning(http://cs224n.stanford.edu/) the class Richard used to teach. - TEDx talk(https://www.youtube.com/watch?v=8cmx7V4oIR8) about where AI is today and where it's going.
Research: Google Scholar Link(https://scholar.google.com/citations?user=FaOcyfMAAAAJ&hl=en) The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies: arXiv link(https://arxiv.org/abs/2004.13332), blog(https://blog.einstein.ai/the-ai-economist/), short video(https://www.youtube.com/watch?v=4iQUcGyQhdA), Q&A(https://salesforce.com/company/news-press/stories/2020/4/salesforce-ai-economist/), Press: VentureBeat(https://venturebeat.com/2020/04/29/salesforces-ai-economist-taps-reinforcement-learning-to-generate-optimal-tax-policies/), TechCrunch(https://techcrunch.com/2020/04/29/salesforce-researchers-are-working-on-an-ai-economist-for-more-equitable-tax-policy/) ProGen: Language Modeling for Protein Generation: bioRxiv link(https://www.biorxiv.org/content/10.1101/2020.03.07.982272v2), blog(https://blog.einstein.ai/progen/) Dye-sensitized solar cells under ambient light powering machine learning: towards autonomous smart sensors for the internet of things, Issue 11 (Chemical Science, 2020).
paper link(https://pubs.rsc.org/en/content/articlelanding/2020/sc/c9sc06145b#!divAbstract) CTRL: A Conditional Transformer Language Model for Controllable Generation: arXiv link(https://arxiv.org/abs/1909.05858), code pre-trained and fine-tuning(https://github.com/salesforce/ctrl), blog(https://blog.einstein.ai/introducing-a-conditional-transformer-language-model-for-controllable-generation/) Genie: a generator of natural language semantic parsers for virtual assistant commands: PLDI 2019 pdf link(https://almond-static.stanford.edu/papers/genie-pldi19.pdf), https://almond.stanford.edu Topics Covered: 0:00 intro 0:42 the AI economist 7:08 the objective function and Gini Coefficient 12:13 on growing up in Eastern Germany and cultural differences 15:02 Language models for protein generation (ProGen) 27:53 CTRL: conditional transformer language model for controllable generation 37:52 Businesses vs Academia 40:00 What ML applications are important to Salesforce 44:57 an underrated aspect of machine learning 48:13 Biggest challenge in making ML work in the real world Visit our podcasts homepage for transcripts and more episodes! www.wandb.com/podcast Get our podcast on Soundcloud, Apple, Spotify, and Google! Soundcloud: https://bit.ly/2YnGjIq Apple Podcasts: https://bit.ly/2WdrUvI Spotify: https://bit.ly/2SqtadF Google: http://tiny.cc/GD_Google Weights and Biases makes developer tools for deep learning. Join our bi-weekly virtual salon and listen to industry leaders and researchers in machine learning share their research: http://tiny.cc/wb-salon Join our community of ML practitioners: http://bit.ly/wb-slack Our gallery features curated machine learning reports by ML researchers. https://app.wandb.ai/gallery
29 Sep 2020, 50min

Zack Chase Lipton — The Medical Machine Learning Landscape
How Zack went from being a musician to professor, how medical applications of Machine Learning are developing, and the challenges of counteracting bias in real world applications. Zachary Chase Lipton is an assistant professor of Operations Research and Machine Learning at Carnegie Mellon University. His research spans core machine learning methods and their social impact and addresses diverse application areas, including clinical medicine and natural language processing. Current research focuses include robustness under distribution shift, breast cancer screening, the effective and equitable allocation of organs, and the intersection of causal thinking with messy data. He is the founder of the Approximately Correct (approximatelycorrect.com) blog and the creator of Dive Into Deep Learning, an interactive open-source book drafted entirely through Jupyter notebooks. Zack’s blog: http://approximatelycorrect.com/ Detecting and Correcting for Label Shift with Black Box Predictors: https://arxiv.org/pdf/1802.03916.pdf Algorithmic Fairness from a Non-Ideal Perspective: https://www.datascience.columbia.edu/data-good-zachary-lipton-lecture Jonas Peters’ lectures on causality: https://youtu.be/zvrcyqcN9Wo 0:00 Sneak peek: Is this a problem worth solving? 0:38 Intro 1:23 Zack’s journey from being a musician to a professor at CMU 4:45 Applying machine learning to medical imaging 10:14 Exploring new frontiers: the most impressive deep learning applications for healthcare 12:45 Evaluating the models – Are they ready to be deployed in hospitals for use by doctors? 19:16 Capturing the signals in evolving representations of healthcare data 27:00 How does the data we capture affect the predictions we make 30:40 Distinguishing between associations and correlations in data – Horror vs romance movies 34:20 The positive effects of augmenting datasets with counterfactually flipped data 39:25 Algorithmic fairness in the real world 41:03 What does it mean to say your model isn’t biased?
43:40 Real world implications of decisions to counteract model bias 49:10 The pragmatic approach to counteracting bias in a non-ideal world 51:24 An underrated aspect of machine learning 55:11 Why defining the problem is the biggest challenge for machine learning in the real world Visit our podcasts homepage for transcripts and more episodes! www.wandb.com/podcast Get our podcast on YouTube, Apple, and Spotify! YouTube: https://www.youtube.com/c/WeightsBiases Soundcloud: https://bit.ly/2YnGjIq Apple Podcasts: https://bit.ly/2WdrUvI Spotify: https://bit.ly/2SqtadF Join our community of ML practitioners where we host AMA's, share interesting projects and meet other people working in Deep Learning: http://bit.ly/wandb-forum
17 Sep 2020, 59min

Anthony Goldbloom — How to Win Kaggle Competitions
Anthony Goldbloom is the founder and CEO of Kaggle. In 2011 & 2012, Forbes Magazine named Anthony as one of the 30 under 30 in technology. In 2011, Fast Company featured him as one of the innovative thinkers who are changing the future of business. He and Lukas discuss the differences in strategies that do well in Kaggle competitions vs academia vs in production. They discuss his 2016 TED talk through the lens of 2020, frameworks, and languages. Topics Discussed: 0:00 Sneak Peek 0:20 Introduction 0:45 methods used in Kaggle competitions vs mainstream academia 2:30 Feature engineering 3:55 Kaggle Competitions now vs 10 years ago 8:35 Data augmentation strategies 10:06 Overfitting in Kaggle Competitions 12:53 How to not overfit 14:11 Kaggle competitions vs the real world 18:15 Getting into ML through Kaggle 22:03 Other Kaggle products 25:48 Favorite underappreciated kernel or dataset 28:27 Python & R 32:03 Frameworks 35:15 2016 TED talk through the lens of 2020 37:54 Reinforcement Learning 38:43 What’s the topic in ML that people don’t talk about enough? 42:02 Where are the biggest bottlenecks in deploying ML software? Check out Kaggle: https://www.kaggle.com/ Follow Anthony on Twitter: https://twitter.com/antgoldbloom Watch his 2016 TED Talk: https://www.ted.com/talks/anthony_goldbloom_the_jobs_we_ll_lose_to_machines_and_the_ones_we_won_t Visit our podcasts homepage for transcripts and more episodes! www.wandb.com/podcast Get our podcast on Soundcloud, Apple, and Spotify! Soundcloud: https://bit.ly/2YnGjIq Apple Podcasts: https://bit.ly/2WdrUvI Spotify: https://bit.ly/2SqtadF
Weights and Biases: We’re always free for academics and open source projects. Email carey@wandb.com with any questions or feature suggestions. * Blog: https://www.wandb.com/articles * Gallery: See what you can create with W&B - https://app.wandb.ai/gallery * Join our community of ML practitioners working on interesting problems - https://www.wandb.com/ml-community Host: Lukas Biewald - https://twitter.com/l2k Producer: Lavanya Shukla - https://twitter.com/lavanyaai Editor: Cayla Sharp - http://caylasharp.com/
9 Sep 2020, 44min

Suzana Ilić — Cultivating Machine Learning Communities
👩💻Today our guest is Suzana Ilić! Suzana is a founder of Machine Learning Tokyo, a nonprofit organization dedicated to democratizing Machine Learning. They are a team of ML Engineers and Researchers and a community of more than 3000 people. Machine Learning Tokyo: https://mltokyo.ai/ Follow Suzana on Twitter: https://twitter.com/suzatweet Check out our podcasts homepage for transcripts and more episodes! www.wandb.com/podcast 🔊 Get our podcast on Apple and Spotify! Apple Podcasts: https://bit.ly/2WdrUvI Spotify: https://bit.ly/2SqtadF 👩🏼🚀Weights and Biases: We’re always free for academics and open source projects. Email carey@wandb.com with any questions or feature suggestions. - Blog: https://www.wandb.com/articles - Gallery: See what you can create with W&B - https://app.wandb.ai/gallery - Continue the conversation on our Slack community - http://bit.ly/wandb-forum 🎙Host: Lukas Biewald - https://twitter.com/l2k 👩🏼💻Producer: Lavanya Shukla - https://twitter.com/lavanyaai 📹Editor: Cayla Sharp - http://caylasharp.com/
2 Sep 2020, 34min