Predictive Models on Random Data
Data Skeptic22 Heinä 2016

Predictive Models on Random Data

This week is an insightful discussion with Claudia Perlich about some situations in machine learning where models can be built, perhaps by well-intentioned practitioners, to appear to be highly predictive despite being trained on random data. Our discussion covers some novel observations about ROC and AUC, as well as an informative discussion of leakage.

Much of our discussion is inspired by two excellent papers Claudia authored: Leakage in Data Mining: Formulation, Detection, and Avoidance and On Cross Validation and Stacking: Building Seemingly Predictive Models on Random Data. Both are highly recommended reading!

Tämä jakso on lisätty Podme-palveluun avoimen RSS-syötteen kautta eikä se ole Podmen omaa tuotantoa. Siksi jakso saattaa sisältää mainontaa.

Jaksot(601)

Shadow Profiles on Social Networks

Shadow Profiles on Social Networks

Emre Sarigol joins me this week to discuss his paper Online Privacy as a Collective Phenomenon. This paper studies data collected from social networks and how the sharing behaviors of individuals can ...

13 Helmi 201538min

[MINI] The Chi-Squared Test

[MINI] The Chi-Squared Test

The Chi-Squared test is a methodology for hypothesis testing. When one has categorical data, in the form of frequency counts or observations (e.g. Vegetarian, Pescetarian, and Omnivore), split into tw...

6 Helmi 201517min

Mapping Reddit Topics with Randy Olson

Mapping Reddit Topics with Randy Olson

My quest this week is noteworthy a.i. researcher Randy Olson who joins me to share his work creating the Reddit World Map - a visualization that illuminates clusters in the reddit community based on u...

30 Tammi 201529min

[MINI] Partially Observable State Spaces

[MINI] Partially Observable State Spaces

When dealing with dynamic systems that are potentially undergoing constant change, its helpful to describe what "state" they are in.  In many applications the manner in which the state changes from on...

23 Tammi 201512min

Easily Fooling Deep Neural Networks

Easily Fooling Deep Neural Networks

My guest this week is Anh Nguyen, a PhD student at the University of Wyoming working in the Evolving AI lab. The episode discusses the paper Deep Neural Networks are Easily Fooled [pdf] by Anh Nguyen,...

16 Tammi 201528min

[MINI] Data Provenance

[MINI] Data Provenance

This episode introduces a high level discussion on the topic of Data Provenance, with more MINI episodes to follow to get into specific topics. Thanks to listener Sara L who wrote in to point out the ...

9 Tammi 201510min

Doubtful News, Geology, Investigating Paranormal Groups, and Thinking Scientifically with Sharon Hill

Doubtful News, Geology, Investigating Paranormal Groups, and Thinking Scientifically with Sharon Hill

I had the change to speak with well known Sharon Hill (@idoubtit) for the first episode of 2015. We discuss a number of interesting topics including the contributions Doubtful News makes to getting sc...

3 Tammi 201531min

[MINI] Belief in Santa

[MINI] Belief in Santa

In this quick holiday episode, we touch on how one would approach modeling the statistical distribution over the probability of belief in Santa Claus given age.

26 Joulu 20149min

Suosittua kategoriassa Tiede

rss-mita-tulisi-tietaa
rss-poliisin-mieli
tiedekulma-podcast
menologeja-tutkimusmatka-vaihdevuosiin
sotataidon-ytimessa
filocast-filosofian-perusteet
rss-duodecim-lehti
rss-astetta-parempi-elama-podcast
rss-lapsuuden-rakentajat-podcast
utelias-mieli
docemilia
radio-antro
rss-ranskaa-raakana
rss-kasvatuspsykologiaa-kaikille
rss-tiedetta-vai-tarinaa
rss-luontopodi-samuel-glassar-tutkii-luonnon-ihmeita
rss-sosiopodi