Annotator Bias
Data Skeptic · 23 Nov 2019

Modern deep learning approaches to natural language processing are voracious in their demands for large training corpora. Folk wisdom once held that around 100k documents were required for effective training. The availability of broadly trained, general-purpose models like BERT has made it possible to use transfer learning to achieve novel results on much smaller corpora.

Thanks to these advancements, an NLP researcher can get value out of fewer examples: transfer learning provides a head start, leaving the model to learn only the nuances of language specific to the task at hand. Small specialized corpora are thus both useful and practical to create.

In this episode, Kyle speaks with Mor Geva, lead author on the recent paper Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets, which explores some unintended consequences of the typical procedure followed for generating corpora.

Source code for the paper available here: https://github.com/mega002/annotator_bias

Episodes (590)

Listener Survey Review

In this episode, Kyle and Linhda review the results of our recent survey. Hear all about the demographic details and how we interpret these results.

11 Aug 2020 · 23 min

Human Computer Interaction and Online Privacy

Moses Namara from the HATLab joins us to discuss his research into the interaction between privacy and human-computer interaction.

27 July 2020 · 32 min

Authorship Attribution of Lennon McCartney Songs

Mark Glickman joins us to discuss the paper Data in the Life: Authorship Attribution in Lennon-McCartney Songs.

20 July 2020 · 33 min

GANs Can Be Interpretable

Erik Härkönen joins us to discuss the paper GANSpace: Discovering Interpretable GAN Controls. During the interview, Kyle references this amazing interpretable GAN controls video and its accompanying codebase found here. Erik mentions the GANSpace Colab notebook, a rapid way to try these ideas out for yourself.

11 July 2020 · 26 min

Sentiment Preserving Fake Reviews

David Ifeoluwa Adelani joins us to discuss Generating Sentiment-Preserving Fake Online Reviews Using Neural Language Models and Their Human- and Machine-based Detection.

6 July 2020 · 28 min

Interpretability Practitioners

Sungsoo Ray Hong joins us to discuss the paper Human Factors in Model Interpretability: Industry Practices, Challenges, and Needs.

26 June 2020 · 32 min

Facial Recognition Auditing

Deb Raji joins us to discuss her recent publication Saving Face: Investigating the Ethical Concerns of Facial Recognition Auditing.

19 June 2020 · 47 min

Robust Fit to Nature

Uri Hasson joins us this week to discuss the paper Robust-fit to Nature: An Evolutionary Perspective on Biological (and Artificial) Neural Networks.

12 June 2020 · 38 min
