2015 Holiday Special
Data Skeptic, 25 December 2015

Today's episode is a reading of Isaac Asimov's The Machine that Won the War. I can't think of a story that's more appropriate for Data Skeptic.

Episodes (589)

Predictive Models on Random Data

This week is an insightful discussion with Claudia Perlich about some situations in machine learning where models can be built, perhaps by well-intentioned practitioners, to appear to be highly predictive despite being trained on random data. Our discussion covers some novel observations about ROC and AUC, as well as an informative discussion of leakage. Much of our discussion is inspired by two excellent papers Claudia authored: Leakage in Data Mining: Formulation, Detection, and Avoidance and On Cross Validation and Stacking: Building Seemingly Predictive Models on Random Data. Both are highly recommended reading!

22 July 2016, 36 min

[MINI] Receiver Operating Characteristic (ROC) Curve

An ROC curve plots a binary classifier's true positive rate against its false positive rate as the decision threshold varies, showing the trade-off between the two. The area under the curve (AUC) summarizes how well the model discriminates between the classes: 0.5 corresponds to random guessing and 1.0 to perfect separation. Together, ROC and AUC are very useful diagnostics for understanding the power of one's model and how to tune it.

15 July 2016, 11 min
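
As a rough illustration of the ROC and AUC description above, here is a minimal sketch in plain Python. It is not taken from the episode; the function names `roc_points` and `auc` and the toy labels and scores are assumptions made up for this example.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points for every threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    # Sort by descending score: lowering the threshold admits one more example at a time.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last example of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC points, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (hypothetical): 1 = positive class, higher score = more confident.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
pts = roc_points(labels, scores)
print(pts)
print(auc(pts))  # 8 of the 9 positive/negative pairs are ranked correctly
```

The AUC here equals the fraction of positive/negative pairs in which the positive example receives the higher score, which is one common way to read the number.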

Multiple Comparisons and Conversion Optimization

I'm joined by Chris Stucchio this week to discuss how deliberate or uninformed statistical practitioners can derive spurious and arbitrary results via multiple comparisons. We discuss p-hacking and a variety of other important lessons and tips for proper analysis. You can enjoy Chris's writing on his blog at chrisstucchio.com and you may also like his recent talk Multiple Comparisons: Make Your Boss Happy with False Positives, Guaranteed.

8 July 2016, 30 min
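
To make the multiple comparisons problem concrete, the sketch below computes how quickly the chance of at least one false positive grows with the number of tests. It assumes the tests are independent, which real analyses often are not. The function names are hypothetical, and the Bonferroni correction is shown only as one standard remedy, not necessarily the one recommended in the episode.

```python
def family_wise_error_rate(alpha, num_tests):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(alpha, num_tests):
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / num_tests


alpha = 0.05
num_tests = 20  # e.g. 20 page variants compared against a control (hypothetical)

uncorrected = family_wise_error_rate(alpha, num_tests)
corrected = family_wise_error_rate(bonferroni_threshold(alpha, num_tests), num_tests)

print(round(uncorrected, 3))  # roughly 0.64: a spurious "winner" is more likely than not
print(round(corrected, 3))    # just under 0.05 after the correction
```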

[MINI] Leakage

If you'd like to make a good prediction, your best bet is to invent a time machine, visit the future, observe the value, and return to the past. For those without access to time travel technology, we need to avoid including information about the future in our training data when building machine learning models. Similarly, any feature whose value would not actually be available in practice at the time you'd want to use the model to make a prediction is a feature that can introduce leakage into your model.

1 July 2016, 12 min
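
One simple guard against the kind of leakage described above is to record when each feature's value actually becomes known and to drop anything that arrives after the moment of prediction. The sketch below assumes such availability dates exist. The feature names, the dates, and the helper `leak_free_features` are all hypothetical.

```python
from datetime import date


def leak_free_features(available_on, prediction_date):
    """Keep only the features whose value is known on or before prediction_date."""
    return sorted(name for name, known in available_on.items()
                  if known <= prediction_date)


# Hypothetical churn model: predict on 1 July whether a customer leaves in July.
available_on = {
    "plan_type": date(2016, 1, 15),
    "support_tickets_june": date(2016, 6, 30),
    "account_closed_date": date(2016, 7, 20),  # only known after the outcome: leaky
}

usable = leak_free_features(available_on, date(2016, 7, 1))
print(usable)  # ['plan_type', 'support_tickets_june']
```

A filter like this only catches leakage that shows up in timestamps; features that are subtle proxies for the target still need manual review.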

Predictive Policing

Kristian Lum (@KLdivergence) joins me this week to discuss her work at @hrdag on predictive policing. We also discuss Multiple Systems Estimation, a technique for inferring statistical information about a population from separate sources of observation. If you enjoy this discussion, check out the panel "Tyranny of the Algorithm? Predictive Analytics & Human Rights", which was mentioned in the episode.

24 June 2016, 36 min
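
The simplest textbook case of estimating a population from separate sources is the two-list Lincoln-Petersen estimator, sketched below. It is offered only as an illustration of the idea and is not necessarily the model discussed in the episode. It assumes the two lists are independent samples of a closed population and that records can be matched perfectly; the function name and the record IDs are hypothetical.

```python
def lincoln_petersen(list_a, list_b):
    """Estimate total population size from two overlapping lists of record IDs."""
    a, b = set(list_a), set(list_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("the lists share no records, so the estimate is undefined")
    # If list B captures overlap/len(a) of list A, assume it captures the
    # same share of the whole population: N is about len(a) * len(b) / overlap.
    return len(a) * len(b) / overlap


# Hypothetical record IDs documented by two separate sources.
source_a = [1, 2, 3, 4, 5, 6]
source_b = [4, 5, 6, 7, 8, 9, 10, 11]

estimate = lincoln_petersen(source_a, source_b)
print(estimate)  # 6 * 8 / 3 = 16.0, although only 11 distinct records were observed
```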

[MINI] The CAP Theorem

A distributed system cannot simultaneously guarantee all three of consistency, availability, and partition tolerance. Most system architects need to think carefully about how they should appropriately balance the needs of their application across these competing objectives. Linh Da and Kyle discuss the CAP Theorem using the analogy of a phone tree for alerting people about a school snow day.

17 June 2016, 10 min

Detecting Terrorists with Facial Recognition?

A startup is claiming that they can detect terrorists purely through facial recognition. In this solo episode, Kyle explores the plausibility of these claims.

10 June 2016, 33 min

[MINI] Goodhart's Law

Goodhart's law states that "When a measure becomes a target, it ceases to be a good measure". In this mini-episode we discuss how this affects SEO, call centers, and Scrum.

3 June 2016, 10 min
