
Parallelism and Acceleration for Large Language Models with Bryan Catanzaro - #507
Today we’re joined by Bryan Catanzaro, vice president of applied deep learning research at NVIDIA. Most folks know Bryan as one of the founders/creators of cuDNN, the accelerated library for deep neural networks. In our conversation, we explore his interest in high-performance computing and its recent overlap with AI, his current work on Megatron, a framework for training giant language models, and the basic approach for distributing a large language model on DGX infrastructure. We also discuss the three kinds of parallelism that Megatron provides when training models: tensor parallelism, pipeline parallelism, and data parallelism. Finally, we cover his work on the Deep Learning Super Sampling (DLSS) project and the role it's playing in the present and future of game development alongside ray tracing. The complete show notes for this episode can be found at twimlai.com/go/507.
5 Aug 2021 · 50min
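The tensor parallelism discussed in the episode can be sketched in miniature: a linear layer's weight matrix is split column-wise across devices, each device computes a partial output, and the shards are concatenated. A pure-Python toy under that assumption (the helper names are illustrative, not Megatron's API; real implementations shard across GPUs with collective communication ops):

```python
# Illustrative sketch of tensor (model) parallelism for a linear layer:
# split the weight matrix column-wise, compute partial outputs per
# "device", then concatenate the shards to recover the full output.

def matmul(x, w):
    """Multiply matrix x (m x k) by w (k x n) with plain lists."""
    return [[sum(x[i][t] * w[t][j] for t in range(len(w)))
             for j in range(len(w[0]))] for i in range(len(x))]

def split_columns(w, parts):
    """Split weight matrix w column-wise into `parts` equal shards."""
    n = len(w[0]) // parts
    return [[row[p * n:(p + 1) * n] for row in w] for p in range(parts)]

def column_parallel_linear(x, w, parts=2):
    """Each 'device' multiplies x by its shard; outputs are concatenated."""
    shards = split_columns(w, parts)
    partials = [matmul(x, shard) for shard in shards]  # one per device
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

x = [[1.0, 2.0]]
w = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 3.0]]
# Sharded computation matches the unsharded matmul exactly.
assert column_parallel_linear(x, w, parts=2) == matmul(x, w)
```

Pipeline parallelism instead splits the model by layer across devices, and data parallelism replicates the whole model and splits the batch; Megatron combines all three.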

Applying the Causal Roadmap to Optimal Dynamic Treatment Rules with Lina Montoya - #506
Today we close out our 2021 ICML series joined by Lina Montoya, a postdoctoral researcher at UNC Chapel Hill. In our conversation with Lina, who was an invited speaker at the Neglected Assumptions in Causal Inference Workshop, we explore her work applying Optimal Dynamic Treatment (ODT) rules to understand which kinds of individuals respond best to specific interventions in the US criminal justice system. We discuss the concept of neglected assumptions and how it connects to ODT rule estimation, as well as a breakdown of the causal roadmap developed by researchers at UC Berkeley. Finally, Lina talks us through the roadmap as applied to the ODT rule problem, how she’s applied a “superlearner” algorithm to this problem, how it was trained, and what the future of this research looks like. The complete show notes for this episode can be found at twimlai.com/go/506.
2 Aug 2021 · 54min
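The core idea of an ODT rule can be sketched very simply: fit an outcome model per treatment arm, then assign each individual the treatment with the best predicted outcome for their covariates. A toy version with stratum means (all data and names below are hypothetical; real ODT estimation uses richer models such as superlearner ensembles and rests on causal identification assumptions):

```python
# Toy sketch of an optimal dynamic treatment (ODT) rule:
# estimate E[Y | A=a, X=x] per (covariate stratum, arm), then the rule
# assigns each stratum the arm with the highest predicted outcome.

from collections import defaultdict

def fit_outcome_means(data):
    """Estimate E[Y | A=a, X=x] by stratum means over (covariate, arm)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, a, y in data:
        sums[(x, a)] += y
        counts[(x, a)] += 1
    return {key: sums[key] / counts[key] for key in sums}

def odt_rule(means, x, arms=(0, 1)):
    """Assign the treatment arm with the highest predicted outcome."""
    return max(arms, key=lambda a: means.get((x, a), float("-inf")))

# (covariate stratum, treatment arm, observed outcome) -- hypothetical
data = [("young", 0, 0.2), ("young", 1, 0.8),
        ("old", 0, 0.7), ("old", 1, 0.3)]
means = fit_outcome_means(data)
assert odt_rule(means, "young") == 1  # this stratum responds to arm 1
assert odt_rule(means, "old") == 0    # this stratum responds to arm 0
```

The point of the sketch is that the rule is a function of covariates, not a single best treatment for everyone, which is what makes it "dynamic."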

Constraint Active Search for Human-in-the-Loop Optimization with Gustavo Malkomes - #505
Today we continue our ICML series joined by Gustavo Malkomes, a research engineer who joined Intel through its recent acquisition of SigOpt. In our conversation with Gustavo, we explore his paper Beyond the Pareto Efficient Frontier: Constraint Active Search for Multiobjective Experimental Design, which proposes a novel algorithmic solution for the iterative model search process. The algorithm empowers teams to run experiments where they are not optimizing particular metrics but instead identifying parameter configurations that satisfy constraints in the metric space. This allows users to explore multiple metrics at once in an efficient, informed, and intelligent way that lends itself to real-world, human-in-the-loop scenarios. The complete show notes for this episode can be found at twimlai.com/go/505.
29 Jul 2021 · 50min
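The shift from optimization to constraint satisfaction can be illustrated with a naive random-sampling version (this is not the paper's algorithm, which searches actively; every function and value below is hypothetical): rather than returning one best configuration, the search keeps every configuration whose metrics fall inside user-given bounds.

```python
# Toy constraint-satisfaction search: sample parameter configurations,
# evaluate their metrics, and keep all configs meeting every constraint,
# giving a human a set of feasible designs rather than a single optimum.

import random

def satisfies(metrics, constraints):
    """Check each metric against its (lo, hi) constraint interval."""
    return all(lo <= metrics[name] <= hi
               for name, (lo, hi) in constraints.items())

def constraint_search(evaluate, sample, constraints, budget=100, seed=0):
    """Sample configs within a budget; return the feasible ones."""
    rng = random.Random(seed)
    feasible = []
    for _ in range(budget):
        config = sample(rng)
        if satisfies(evaluate(config), constraints):
            feasible.append(config)
    return feasible

# Hypothetical two-metric problem: accuracy vs. latency of a 1-D config.
evaluate = lambda x: {"accuracy": 1 - (x - 0.7) ** 2, "latency": 10 * x}
sample = lambda rng: rng.uniform(0, 1)
found = constraint_search(evaluate, sample,
                          {"accuracy": (0.9, 1.0), "latency": (0.0, 9.0)})
# Every returned config satisfies both metric constraints.
assert all(satisfies(evaluate(x),
                     {"accuracy": (0.9, 1.0), "latency": (0.0, 9.0)})
           for x in found)
```

Returning a diverse feasible set, instead of a single optimum, is what makes the approach fit human-in-the-loop workflows: the practitioner makes the final tradeoff.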

Fairness and Robustness in Federated Learning with Virginia Smith - #504
Today we kick off our ICML coverage joined by Virginia Smith, an assistant professor in the Machine Learning Department at Carnegie Mellon University. In our conversation with Virginia, we explore her work on cross-device federated learning applications, including how the distributed learning aspects of FL relate to its privacy techniques. We dig into her paper from ICML, Ditto: Fair and Robust Federated Learning Through Personalization, what fairness means in contrast to AI ethics, the particulars of the failure modes, the relationship between the models and the objectives being optimized across devices, and the tradeoffs between fairness and robustness. We also discuss a second paper, Heterogeneity for the Win: One-Shot Federated Clustering, how the proposed method makes data heterogeneity beneficial, how that heterogeneity is characterized, and some applications of FL in an unsupervised setting. The complete show notes for this episode can be found at twimlai.com/go/504.
26 Jul 2021 · 36min
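Ditto's personalization idea is that each device trains a personal model on its own loss while being regularized toward the global federated model, with a regularization strength that mediates the robustness/personalization tradeoff discussed in the episode. A 1-D toy sketch under that reading (all values, names, and the quadratic loss are illustrative, not the paper's experiments):

```python
# Toy Ditto-style personalization: minimize
#   f_k(v) + lam/2 * (v - w_star)**2
# by gradient descent, where f_k(v) = (v - local_opt)**2 / 2 is a toy
# local loss. Small lam tracks the local data; large lam tracks the
# global model (robustness to a poor or poisoned local fit).

def ditto_personalize(local_opt, w_global, lam, lr=0.1, steps=500):
    """Gradient descent on the regularized local objective."""
    v = w_global  # initialize the personal model at the global model
    for _ in range(steps):
        grad = (v - local_opt) + lam * (v - w_global)
        v -= lr * grad
    return v

w_star = 0.0          # global model (e.g. from federated averaging)
local_optimum = 2.0   # this device's best-fitting model
v_small = ditto_personalize(local_optimum, w_star, lam=0.1)
v_large = ditto_personalize(local_optimum, w_star, lam=10.0)
# Small lambda stays near the local optimum; large lambda hugs the
# global model.
assert abs(v_small - local_optimum) < abs(v_large - local_optimum)
```

For this quadratic toy the minimizer has the closed form (local_opt + lam * w_star) / (1 + lam), which makes the interpolation between local and global models explicit.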

Scaling AI at H&M Group with Errol Koolmeister - #503
Today we’re joined by Errol Koolmeister, the head of AI foundation at H&M Group. In our conversation with Errol, we explore H&M’s AI journey, including its wide adoption across the company in 2016, and the various use cases in which it's deployed, like fashion forecasting and pricing algorithms. We discuss Errol’s first steps in taking on the challenge of scaling AI broadly at the company, the value-added learning from proofs of concept, and how to align these efforts in a sustainable, long-term way. Of course, we dig into the infrastructure and models being used, the biggest challenges faced, and the importance of managing the project portfolio, while Errol shares his approach to building infrastructure for a specific product while keeping many products in mind.
22 Jul 2021 · 41min

Evolving AI Systems Gracefully with Stefano Soatto - #502
Today we’re joined by Stefano Soatto, VP of AI applications science at AWS and a professor of computer science at UCLA. Our conversation with Stefano centers on his recent research, called Graceful AI, which focuses on how to make trained systems evolve gracefully. We discuss the broader motivation for this research and the potential dangers or negative effects of constantly retraining ML models in production. We also talk about research into error-rate clustering, the importance of model architecture when dealing with problems of model compression, how they’ve addressed problems of regression and reprocessing by utilizing existing models, and much more. The complete show notes for this episode can be found at twimlai.com/go/502.
19 Jul 2021 · 49min
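One concrete way to quantify the regressions that retraining can introduce, in the spirit of this line of work, is to count samples the old model classified correctly that the new model now gets wrong, even when aggregate accuracy improves. A minimal sketch (the metric name and all labels/predictions below are illustrative):

```python
# Sketch of a regression metric between model versions: the fraction of
# samples where the old model was right and the new model is wrong.
# Aggregate accuracy can improve while this rate is still nonzero,
# which is exactly the "ungraceful evolution" users experience.

def negative_flip_rate(labels, old_preds, new_preds):
    """Fraction of all samples where old was right and new is wrong."""
    flips = sum(1 for y, o, n in zip(labels, old_preds, new_preds)
                if o == y and n != y)
    return flips / len(labels)

labels    = [0, 1, 1, 0, 1]
old_preds = [0, 1, 0, 0, 0]  # 3/5 correct
new_preds = [0, 1, 1, 1, 1]  # 4/5 correct, but one regression (index 3)
assert negative_flip_rate(labels, old_preds, new_preds) == 0.2
```

Tracking a metric like this alongside accuracy makes "did the retrained model break anything that used to work?" a measurable question rather than an anecdotal one.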

ML Innovation in Healthcare with Suchi Saria - #501
Today we’re joined by Suchi Saria, the founder and CEO of Bayesian Health, the John C. Malone Associate Professor of computer science, statistics, and health policy, and the director of the Machine Learning and Healthcare Lab at Johns Hopkins University. Suchi shares a bit about her journey to working at the intersection of machine learning and healthcare, and how her research has spanned both medical policy and discovery. We discuss why it has taken so long for machine learning to become accepted and adopted by the healthcare infrastructure, where exactly we stand in the adoption process, and where there have been “pockets” of tangible success. Finally, we explore the state of healthcare data, and of course, we talk about Suchi’s recently announced startup Bayesian Health and its goals in the healthcare space, along with an accompanying study that looks at real-time ML inference in an EMR setting. The complete show notes for this episode can be found at twimlai.com/go/501.
15 Jul 2021 · 45min

Cross-Device AI Acceleration, Compilation & Execution with Jeff Gehlhaar - #500
Today we’re joined by a friend of the show, Jeff Gehlhaar, VP of technology and head of AI software platforms at Qualcomm. In our conversation with Jeff, we cover a ton of ground, starting with a bit of exploration around ML compilers, what they are, and their role in solving issues of parallelism. We also dig into the latest addition to the Snapdragon platform, AI Engine Direct, and how it works as a bridge to bring more capabilities across their platform. We then discuss how benchmarking works in the context of the platform, how the work of other researchers we’ve spoken to on compression and quantization finds its way from research to product, and much more! After you check out this interview, you can look below for some of the other conversations with researchers mentioned. The complete show notes for this episode can be found at twimlai.com/go/500.
12 Jul 2021 · 41min
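The quantization research mentioned in the episode typically moves models from 32-bit floats to 8-bit integers for efficient on-device inference. A minimal sketch of symmetric post-training int8 quantization (illustrative only; production toolchains use calibrated, per-channel schemes, and none of the names below are a real API):

```python
# Minimal symmetric int8 quantization: pick one scale so the largest
# weight maps to 127, round each weight to an int8 code, and recover
# approximate floats by multiplying back by the scale.

def quantize_int8(weights):
    """Map floats to int8 codes via a single symmetric scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Rounding bounds the reconstruction error by half a quantization step.
assert all(abs(w - r) <= scale / 2 + 1e-12
           for w, r in zip(weights, restored))
```

The appeal on mobile hardware is that int8 weights take a quarter of the memory and map onto fast integer multiply units, at the cost of this bounded rounding error.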