Data Skeptic, 19 Aug 2019

Building the howto100m Video Corpus

Video annotation is an expensive and time-consuming process. As a consequence, the available video datasets are useful but small. The availability of machine-transcribed explainer videos offers a unique opportunity to rapidly develop a useful, if dirty, corpus of videos that are "self-annotating", as hosts explain the actions they are taking on the screen.

This episode discusses the HowTo100M dataset, a project that has assembled a video corpus of 136M captioned video clips covering 23k activities.

Episodes (590)

Adversarial Explanations

Walt Woods joins us to discuss his paper Adversarial Explanations for Understanding Image Classification Decisions and Improved Neural Network Robustness, written with co-authors Jack Chen and Christof Teuscher.

14 Feb 2020, 36 min

ObjectNet

Andrei Barbu joins us to discuss ObjectNet - a new kind of vision dataset. In contrast to ImageNet, ObjectNet seeks to provide images that are more representative of the types of images an autonomous machine is likely to encounter in the real world. Collecting a dataset in this way required careful use of Mechanical Turk to get Turkers to provide a corpus of images that removes some of the bias found in ImageNet. http://0xab.com/

7 Feb 2020, 38 min

Visualization and Interpretability

Enrico Bertini joins us to discuss how data visualization can be used to help make machine learning more interpretable and explainable. Find out more about Enrico at http://enrico.bertini.io/. More from Enrico with co-host Moritz Stefaner on the Data Stories podcast!

31 Jan 2020, 35 min

Interpretable One Shot Learning

We welcome Su Wang back to Data Skeptic to discuss the paper Distributional modeling on a diet: One-shot word learning from text only.

26 Jan 2020, 30 min

Fooling Computer Vision

Wiebe van Ranst joins us to talk about a project in which specially designed printed images can fool a computer vision system, preventing it from identifying a person. The attack targets the popular pre-trained YOLO2 image recognition model and is therefore likely to be widely applicable.

22 Jan 2020, 25 min

Algorithmic Fairness

This episode includes an interview with Aaron Roth, author of The Ethical Algorithm.

14 Jan 2020, 42 min

Interpretability

Machine learning has expanded rapidly into every sector and industry. With increasing reliance on models and increasing stakes for the decisions of models, questions of how models actually work are becoming increasingly important to ask. In this episode, Kyle interviews Christoph Molnar about his book Interpretable Machine Learning. Thanks to our sponsor, the Gartner Data & Analytics Summit, taking place in Grapevine, TX on March 23 – 26, 2020. Use discount code: dataskeptic. Music: our new theme song is #5 by Big D and the Kids Table. Incidental music by Tanuki Suit Riot.

7 Jan 2020, 32 min

NLP in 2019

A year in recap.

31 Dec 2019, 38 min
