Building the howto100m Video Corpus
Data Skeptic19 Elo 2019

Building the howto100m Video Corpus

Video annotation is an expensive and time-consuming process. As a consequence, the available video datasets are useful but small. The availability of machine transcribed explainer videos offers a unique opportunity to rapidly develop a useful, if dirty, corpus of videos that are "self annotating", as hosts explain the actions they are taking on the screen.

This episode is a discussion of the HowTo100m dataset - a project which has assembled a video corpus of 136M video clips with captions covering 23k activities.

Related Links

The paper will be presented at ICCV 2019

@antoine77340

Antoine on Github

Antoine's homepage

Tämä jakso on lisätty Podme-palveluun avoimen RSS-syötteen kautta eikä se ole Podmen omaa tuotantoa. Siksi jakso saattaa sisältää mainontaa.

Jaksot(601)

Streetlight Outage and Crime Rate Analysis with Zach Seeskin

Streetlight Outage and Crime Rate Analysis with Zach Seeskin

This episode features a discussion with statistics PhD student Zach Seeskin about a project he was involved in as part of the Eric and Wendy Schmidt Data Science for Social Good Summer Fellowship.  Th...

18 Heinä 201433min

[MINI] Experimental Design

[MINI] Experimental Design

This episode loosely explores the topic of Experimental Design including hypothesis testing, the importance of statistical tests, and an everyday and business example.

11 Heinä 201415min

The Right (big data) Tool for the Job with Jay Shankar

The Right (big data) Tool for the Job with Jay Shankar

In this week's episode, we discuss applied solutions to big data problem with big data engineer Jay Shankar.  The episode explores approaches and design philosophy to solving real world big data busin...

7 Heinä 201449min

[MINI] Bayesian Updating

[MINI] Bayesian Updating

In this minisode, we discuss Bayesian Updating - the process by which one can calculate the most likely hypothesis might be true given one's older / prior belief and all new evidence.

27 Kesä 201411min

Personalized Medicine with Niki Athanasiadou

Personalized Medicine with Niki Athanasiadou

In the second full length episode of the podcast, we discuss the current state of personalized medicine and the advancements in genetics that have made it possible.

20 Kesä 201457min

[MINI] p-values

[MINI] p-values

In this mini, we discuss p-values and their use in hypothesis testing, in the context of an hypothetical experiment on plant flowering, and end with a reference to the Particle Fever documentary and h...

13 Kesä 201416min

Advertising Attribution with Nathan Janos

Advertising Attribution with Nathan Janos

A conversation with Convertro's Nathan Janos about methodologies used to help advertisers understand the affect each of their marketing efforts (print, SEM, display, skywriting, etc.) contributes to t...

6 Kesä 20141h 16min

[MINI] type i / type ii errors

[MINI] type i / type ii errors

In this first mini-episode of the Data Skeptic Podcast, we define and discuss type i and type ii errors (a.k.a. false positives and false negatives).

30 Touko 201411min

Suosittua kategoriassa Tiede

rss-poliisin-mieli
tiedekulma-podcast
rss-mita-tulisi-tietaa
docemilia
filocast-filosofian-perusteet
menologeja-tutkimusmatka-vaihdevuosiin
rss-duodecim-lehti
sotataidon-ytimessa
rss-tiedetta-vai-tarinaa
rss-lapsuuden-rakentajat-podcast
utelias-mieli
radio-antro
rss-bios-podcast
rss-ranskaa-raakana
rss-metsantuntijat-podcast
rss-luontopodi-samuel-glassar-tutkii-luonnon-ihmeita
rss-lihavuudesta-podcast
rss-sosiopodi