Data Skeptic, 25 Dec 2015

2015 Holiday Special

Today's episode is a reading of Isaac Asimov's "The Machine That Won the War." I can't think of a story that's more appropriate for Data Skeptic.

Episodes (589)

Urban Congestion

Urban congestion affects every person living in a city of any reasonable size. Lewis Lehe joins us in this episode to share his work on downtown congestion pricing. We explore how different pricing mechanisms affect congestion, as well as how data visualization can inform choices. You can find examples of Lewis's work at setosa.io. His paper, which we discussed during the interview, is Distance-dependent congestion pricing for downtown zones. On this episode, we also discuss State of California data, which can be found at pems.dot.ca.gov.

16 Sep 2016, 35 min

[MINI] Heteroskedasticity

Heteroskedasticity is a term used to describe a relationship between two variables which has unequal variance over the range. For example, the variance in the length of a cat's tail almost certainly changes (grows) with age. On the other hand, the average amount of chewing gum a person consumes probably has a consistent variance over a wide range of human heights. We also discuss some issues with the visualization shown in the tweet embedded below.

9 Sep 2016, 8 min
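
To make the idea of unequal variance concrete, here is a minimal sketch that is not taken from the episode. It simulates one series whose noise grows with x and one with constant noise, fits a least-squares line to each, and compares the residual variance in the upper half of x with that in the lower half. The data, the noise levels, and the helper names `ols_residuals` and `spread_ratio` are all assumptions made for illustration.

```python
import random
from statistics import mean, pvariance


def ols_residuals(xs, ys):
    """Residuals from a least-squares line fitted to (xs, ys)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def spread_ratio(xs, residuals):
    """Residual variance in the upper half of x divided by the lower half."""
    pairs = sorted(zip(xs, residuals))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return pvariance(high) / pvariance(low)


rng = random.Random(0)  # fixed seed so the simulated data is reproducible
xs = [float(i) for i in range(1, 201)]
hetero = [2.0 * x + rng.gauss(0.0, 0.5 * x) for x in xs]  # noise grows with x
homo = [2.0 * x + rng.gauss(0.0, 20.0) for x in xs]       # constant noise

hetero_ratio = spread_ratio(xs, ols_residuals(xs, hetero))
homo_ratio = spread_ratio(xs, ols_residuals(xs, homo))
print(f"heteroskedastic ratio: {hetero_ratio:.2f}")
print(f"homoskedastic ratio:   {homo_ratio:.2f}")
```

A ratio well above 1 suggests the spread grows with x, while a ratio near 1 is what constant variance would look like. This is only an eyeball check, not a formal statistical test.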

Music21

Our guest today is Michael Cuthbert, an associate professor of music at MIT and principal investigator of the Music21 project, which is the focus of our discussion. Music21 is a Python library that makes analysis of music accessible and fun. It supports integration with popular formats such as MIDI, MusicXML, Lilypond, and others. It's also well integrated with The Elvis Project, enabling users to import large volumes of music for easy analysis. Music21 is a great platform for musicologists and machine learning researchers alike to explore patterns and structure in music.

2 Sep 2016, 34 min

[MINI] Paxos

Paxos is a protocol for arriving at a consensus in a distributed computing system, one which accounts for the unreliability of the nodes. We discuss how this might be used in the real world in the event of a massive disaster.

26 Aug 2016, 14 min
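
As a rough illustration of how consensus can survive unreliable nodes, here is a heavily simplified, in-memory sketch of single-decree Paxos. It is not from the episode. It omits networking, message loss, retries, learners, and durable storage, and the `Acceptor` class, the `propose` function, and the "shelter" values are hypothetical names chosen for this example.

```python
class Acceptor:
    """One voting node; `alive` simulates the node being reachable."""

    def __init__(self):
        self.promised = -1    # highest proposal number promised so far
        self.accepted = None  # (number, value) of the last accepted proposal
        self.alive = True

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if not self.alive or n <= self.promised:
            return None
        self.promised = n
        return ("promise", self.accepted)

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise was made."""
        if not self.alive or n < self.promised:
            return False
        self.promised = n
        self.accepted = (n, value)
        return True


def propose(acceptors, n, value):
    """Run one proposer round; return the chosen value, or None without a majority."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    promises = [r for r in replies if r is not None]
    if len(promises) < majority:
        return None
    prior = [acc for _, acc in promises if acc is not None]
    if prior:
        # Must adopt the value of the highest-numbered proposal already accepted.
        value = max(prior, key=lambda p: p[0])[1]
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None


cluster = [Acceptor() for _ in range(5)]
cluster[3].alive = cluster[4].alive = False   # two of five nodes are down
first = propose(cluster, 1, "shelter A")      # three live nodes form a majority

cluster[0].alive = False                      # another node fails...
cluster[3].alive = True                       # ...and one recovers
second = propose(cluster, 2, "shelter B")     # a competing proposal arrives
print(first, "|", second)
```

The second proposer asks for "shelter B" but ends up re-proposing "shelter A", because any majority it can reach overlaps with the majority that already accepted the earlier value. That overlap is what keeps the decision stable while individual nodes come and go.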

Trusting Machine Learning Models with LIME

Machine learning models are often criticized for being black boxes. If a human cannot determine why the model arrives at the decision it made, there's good cause for skepticism. Classic inspection approaches to model interpretability are only useful for simple models, which are likely to cover only simple problems. The LIME project seeks to help us trust machine learning models. At a high level, it takes advantage of local fidelity: for a given example, a separate model trained on neighbors of that example is likely to reveal which features in the local input space are relevant, and so why the model arrives at its conclusion. In this episode, Marco Tulio Ribeiro joins us to discuss how LIME (Local Interpretable Model-Agnostic Explanations) can help users trust machine learning models. The accompanying paper is titled "Why Should I Trust You?": Explaining the Predictions of Any Classifier.

19 Aug 2016, 35 min
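
The local-fidelity idea can be sketched in a few lines. The following is not the `lime` package's API and not the paper's exact procedure. It is a simplified illustration that assumes Gaussian perturbations, an exponential proximity kernel, and ordinary weighted least squares. The `black_box` function and all parameter values are made up for the example.

```python
import math
import random


def black_box(x):
    """Stand-in for an opaque model: nonlinear in x[0], linear in x[1]."""
    return x[0] ** 2 + 3.0 * x[1]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def local_linear_explanation(predict, instance, n_samples=500,
                             scale=0.5, width=0.5, seed=0):
    """Fit a proximity-weighted linear surrogate; return [intercept, w0, w1, ...]."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        delta = [rng.gauss(0.0, scale) for _ in instance]
        neighbor = [xi + di for xi, di in zip(instance, delta)]
        rows.append([1.0] + delta)  # bias column plus offsets from the instance
        targets.append(predict(neighbor))
        dist_sq = sum(d * d for d in delta)
        weights.append(math.exp(-dist_sq / width ** 2))  # closer samples count more
    k = len(instance) + 1
    # Weighted normal equations: (X^T W X) beta = X^T W y
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets))
            for i in range(k)]
    return solve(xtwx, xtwy)


coefs = local_linear_explanation(black_box, [2.0, 1.0])
print([round(c, 2) for c in coefs])
```

Near the point (2, 1) the surrogate's two slopes should land close to 4 and 3, the local gradient of the stand-in model, even though the model is not linear overall. Those slopes are the "explanation" for this one prediction.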

[MINI] ANOVA

Analysis of variance is a method used to evaluate differences between two or more groups. It works by breaking down the total variance of the system into the between-group variance and the within-group variance. We discuss this method in the context of wait times getting coffee at Starbucks.

12 Aug 2016, 12 min
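
The variance decomposition described above can be worked through by hand. This is a minimal one-way ANOVA sketch. The wait times (in minutes) are invented for illustration and are not data from the episode, and `one_way_anova` is a hypothetical helper name.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (ss_between, ss_within, f_statistic) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group: how far each value sits from its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


# Hypothetical coffee wait times in minutes at three times of day.
morning = [4, 5, 6]
noon = [7, 8, 9]
evening = [1, 2, 3]

ss_between, ss_within, f_stat = one_way_anova([morning, noon, evening])
print(ss_between, ss_within, f_stat)  # 54.0 6.0 27.0
```

The two sums of squares add up to the total sum of squares around the grand mean (60 here). A large F statistic means the group means differ by more than the within-group scatter would explain. Turning F into a p-value requires the F distribution, which this sketch leaves out.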

Machine Learning on Images with Noisy Human-centric Labels

When humans describe images, they have a reporting bias, in that they report only what they consider important. Thus, in addition to considering whether something is present in an image, one should consider whether it is also relevant to the image before labeling it. Ishan Misra joins us this week to discuss his recent paper Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels, which explores a novel architecture for learning to distinguish presence and relevance. This work enables web-scale datasets to be useful for training, not just well-groomed, hand-labeled corpora.

5 Aug 2016, 23 min

[MINI] Survival Analysis

Survival analysis techniques are useful for studying the longevity of groups of elements or individuals, taking into account time considerations and right censoring. This episode explores how survival analysis can describe marriages, in particular using the semi-parametric Cox proportional hazards model. This episode discusses some good summaries of survey data on marriage and divorce which can be found here. The Python lifelines library is a good place to get started for people who want to do some hands-on work.

29 Jul 2016, 14 min
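
To show how right censoring enters the arithmetic, here is a small sketch of the Kaplan-Meier estimator. Note that this is a simpler technique than the Cox model mentioned in the description, and it is written in plain Python, not with lifelines. The marriage durations are invented for illustration, and `kaplan_meier` is a hypothetical helper name.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time an event occurred.

    observed[i] is True if the event (here, divorce) was seen at durations[i],
    and False if the subject was right-censored (still married when last seen).
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        events = sum(1 for d, o in zip(durations, observed) if d == t and o)
        leaving = sum(1 for d in durations if d == t)
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # both events and censored subjects leave the risk set
    return curve


# Hypothetical marriages: years observed, and whether a divorce was recorded.
years = [2, 3, 3, 5, 8, 8, 10, 12]
divorced = [True, True, False, True, False, True, False, False]

curve = kaplan_meier(years, divorced)
for t, s in curve:
    print(f"year {t}: estimated share still married = {s:.3f}")
```

Censored couples never reduce the survival estimate directly, but they do shrink the at-risk count for later years. That is how the estimator uses the partial information they provide without treating them as divorces.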
