Building the howto100m Video Corpus
Data Skeptic, 19 Aug 2019

Video annotation is expensive and time-consuming, so the available annotated video datasets are useful but small. Machine-transcribed explainer videos offer a rare opportunity to rapidly build a useful, if noisy, corpus of "self-annotating" videos, in which hosts describe the actions they are performing on screen.

This episode is a discussion of HowTo100M, a project that has assembled a video corpus of 136M captioned video clips covering 23k activities.
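
To make the "self-annotating" idea concrete, here is a minimal Python sketch of how timestamped transcript segments might be turned into clip and caption pairs. It is an illustration only, not the HowTo100M authors' pipeline: the `Clip` class, the `clips_from_transcript` function, the segment format, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Clip:
    """One video clip paired with the narration spoken during it (hypothetical schema)."""
    video_id: str
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    caption: str


def clips_from_transcript(
    video_id: str, segments: List[Tuple[float, float, str]]
) -> List[Clip]:
    """Turn (start, end, text) transcript segments into captioned clips.

    Assumes each segment already carries timestamps from a speech recognizer.
    Segments with no text or a non-positive duration are skipped.
    """
    clips = []
    for start, end, text in segments:
        caption = " ".join(text.split())  # collapse stray whitespace
        if caption and end > start:
            clips.append(Clip(video_id, start, end, caption))
    return clips


# Made-up sample data, for illustration only.
segments = [
    (0.0, 4.2, "first  crack two eggs into the bowl"),
    (4.2, 9.0, "then whisk until smooth"),
    (9.0, 9.0, ""),  # empty segment, dropped
]
clips = clips_from_transcript("demo_video", segments)
for clip in clips:
    print(f"{clip.start:.1f}-{clip.end:.1f}s: {clip.caption}")
```

Because the captions come from automatic transcription, pairs built this way are noisy: the narration may be misrecognized or may not describe what is on screen at that moment.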

Related Links

The paper will be presented at ICCV 2019

@antoine77340

Antoine on Github

Antoine's homepage
