
Better know a distribution: the Poisson distribution
This is a re-release of an episode that originally ran on October 21, 2018. The Poisson distribution is a probability distribution function used for events that happen in time or space. It’s super...
2 March 2020, 31 min

The Lottery Ticket Hypothesis
Recent research into neural networks reveals that sometimes, not all parts of the neural net are equally responsible for the performance of the network overall. Instead, it seems like (in some neural...
23 February 2020, 19 min

Interesting technical issues prompted by GDPR and data privacy concerns
Data privacy is a huge issue right now, after years of consumers and users gaining awareness of just how much of their personal data is out there and how companies are using it. Policies like GDPR are...
17 February 2020, 20 min

Thinking of data science initiatives as innovation initiatives
Put yourself in the shoes of an executive at a big legacy company for a moment, operating in virtually any market vertical: you’re constantly hearing that data science is revolutionizing the world and...
10 February 2020, 17 min

Building a curriculum for educating data scientists: Interview with Prof. Xiao-Li Meng
As demand for data scientists grows, and it remains as relevant as ever that practicing data scientists have a solid methodological and technical foundation for their work, higher education institutio...
2 February 2020, 31 min

Running experiments when there are network effects
Traditional A/B tests assume that whether or not one person got a treatment has no effect on the experiment outcome for another person. But that’s not a safe assumption, especially when there are netw...
27 January 2020, 24 min

Zeroing in on what makes adversarial examples possible
Adversarial examples are really, really weird: pictures of penguins that get classified with high certainty by machine learning algorithms as drum sets, or random noise labeled as pandas, or any one of...
20 January 2020, 22 min

Unsupervised Dimensionality Reduction: UMAP vs t-SNE
Dimensionality reduction redux: this episode covers UMAP, an unsupervised algorithm designed to make high-dimensional data easier to visualize, cluster, etc. It’s similar to t-SNE but has some advanta...
13 January 2020, 29 min
