
Problem Formulation for Machine Learning with Romer Rosales - TWiML Talk #149
In this episode, I'm joined by Romer Rosales, Director of AI at LinkedIn. We begin with a discussion of graphical models and approximate probability inference, and he helps me make an important connection in the way I think about that topic. We then review some of the applications of machine learning at LinkedIn, and how what Romer calls their ‘holistic approach’ guides the evolution of ML projects at LinkedIn. This leads us into a really interesting discussion about problem formulation and selecting the right objective function for a given problem. We then talk through some of the tools they’ve built to scale their data science efforts, including large-scale constrained optimization solvers, online hyperparameter optimization and more. This was a really fun conversation that I’m sure you’ll enjoy! The notes for this show can be found at twimlai.com/talk/149.
11 June 2018, 50 min

AI for Materials Discovery with Greg Mulholland - TWiML Talk #148
In this episode I’m joined by Greg Mulholland, Founder and CEO of Citrine Informatics, which is applying AI to the discovery and development of new materials. Greg and I start out with an exploration of some of the challenges of the status quo in materials science, and what’s to be gained by introducing machine learning into this process. We discuss how limitations in materials manifest themselves, and Greg shares a few examples from the company’s work optimizing battery components and solar cells. We dig into the role and sources of data used in applying ML in materials, and some of the unique challenges to collecting it, and discuss the pipeline and algorithms Citrine uses to deliver its service. This was a fun conversation that spans physics, chemistry, and of course machine learning, and I hope you enjoy it. The notes for this show can be found at twimlai.com/talk/148.
7 June 2018, 42 min

Data Innovation & AI at Capital One with Adam Wenchel - TWiML Talk #147
In this episode I’m joined by Adam Wenchel, vice president of AI and Data Innovation at Capital One, to discuss how machine learning and AI are being integrated into the bank's day-to-day practices, and how those advances benefit the customer. In our conversation, we look into a few of the many applications of AI at the bank, including fraud detection, money laundering detection, customer service, and automating back office processes. Adam describes some of the challenges of applying ML in financial services and how Capital One maintains consistent portfolio management practices across the organization. We also discuss how the bank has organized to scale its machine learning efforts, and the steps it has taken to overcome the talent shortage in the space. The notes for this show can be found at twimlai.com/talk/147.
4 June 2018, 45 min

Deep Gradient Compression for Distributed Training with Song Han - TWiML Talk #146
On today’s show I chat with Song Han, assistant professor in MIT’s EECS department, about his research on Deep Gradient Compression. In our conversation, we explore the challenge of distributed training for deep neural networks and the idea of compressing the gradient exchange to allow it to be done more efficiently. Song details the evolution of distributed training systems based on this idea, and provides a few examples of centralized and decentralized distributed training architectures such as Uber’s Horovod, as well as the approaches native to PyTorch and TensorFlow. Song also addresses potential issues that arise when considering distributed training, such as loss of accuracy and generalizability, and much more. The notes for this show can be found at twimlai.com/talk/146.
31 May 2018, 46 min

Masked Autoregressive Flow for Density Estimation with George Papamakarios - TWiML Talk #145
In this episode, University of Edinburgh PhD student George Papamakarios and I discuss his paper “Masked Autoregressive Flow for Density Estimation.” George walks us through the idea of Masked Autoregressive Flow, which uses neural networks to produce estimates of probability densities from a set of input examples. We discuss some of the related work that’s laid the groundwork for his research, including Inverse Autoregressive Flow, Real NVP and Masked Autoencoders. We also look at the properties of probability density networks and discuss some of the challenges associated with this effort. The notes for this show can be found at twimlai.com/talk/145.
28 May 2018, 34 min

Training Data for Computer Vision at Figure Eight with Qazaleh Mirsharif - TWiML Talk #144
For today’s show, the last in our TrainAI series, I'm joined by Qazaleh Mirsharif, a machine learning scientist working on computer vision at Figure Eight. Qazaleh and I caught up at the TrainAI conference to discuss a couple of the projects she’s worked on in that field, namely her research into the classification of retinal images and her work on parking sign detection from Google Street View images. The former, which attempted to diagnose diseases like diabetic retinopathy using retinal scan images, is similar to the work I spoke with Ryan Poplin about on TWiML Talk #122. In my conversation with Qazaleh we focus on how she built her datasets for each of these projects and some of the key lessons she’s learned along the way. The notes for this show can be found at twimlai.com/talk/144. For series details, visit twimlai.com/trainai2018.
25 May 2018, 21 min

Agile Data Science with Sarah Aerni - TWiML Talk #143
Today we continue our TrainAI series with Sarah Aerni, Director of Data Science at Salesforce Einstein. Sarah and I sat down at the TrainAI conference to discuss her talk “Notes from the Field: The Platform, People, and Processes of Agile Data Science.” Sarah and I dig into the concept of agile data science, exploring what it means to her and how she’s seen it done at Salesforce and other places she’s worked. We also dig into the notion of machine learning platforms, which is a keen area of interest for me as well. We discuss some of the common elements we’ve seen in ML platforms, and when it makes sense for an organization to start building one. The notes for this show can be found at twimlai.com/talk/143. For more details on the TrainAI series, visit twimlai.com/trainai2018.
24 May 2018, 38 min

Tensor Operations for Machine Learning with Anima Anandkumar - TWiML Talk #142
In this episode of our TrainAI series, I sit down with Anima Anandkumar, Bren Professor at Caltech and Principal Scientist with Amazon Web Services. Anima joined me to discuss the research coming out of her “Tensorlab” at Caltech. In our conversation, we review the application of tensor operations to machine learning and discuss how an example problem, document categorization, might be approached using three-dimensional tensors to discover topics and relationships between topics. We touch on multidimensionality, expectation maximization, and the Amazon products SageMaker and Comprehend. Anima also goes into how to tensorize neural networks and apply our understanding of tensor algebra to perform better architecture searches. The notes for this show can be found at twimlai.com/talk/142. For series info, visit twimlai.com/trainai2018.
23 May 2018, 34 min