Taming arXiv with Natural Language Processing w/ John Bohannon - TWiML Talk #136

Taming arXiv with Natural Language Processing w/ John Bohannon - TWiML Talk #136

In this episode i'm joined by John Bohannan, Director of Science at AI startup Primer. As you all may know, a few weeks ago we released my interview with Google legend Jeff Dean, which, by the way, you should definitely check if you haven’t already. Anyway, in that interview, Jeff mentions the recent explosion of machine learning papers on arXiv, which I responded to jokingly by asking whether Google had already developed the AI system to help them summarize and track all of them. While Jeff didn’t have anything specific to offer, a listener reached out and let me know that John was in fact already working on this problem. In our conversation, John and I discuss his work on Primer Science, a tool that harvests content uploaded to arxiv, sorts it into natural topics using unsupervised learning, then gives relevant summaries of the activity happening in different innovation areas. We spend a good amount of time on the inner workings of Primer Science, including their data pipeline and some of the tools they use, how they determine “ground truth” for training their models, and the use of heuristics to supplement NLP in their processing. The notes for this show can be found at twimlai.com/talk/136

Populärt inom Politik & nyheter

aftonbladet-krim
svenska-fall
motiv
p3-krim
fordomspodden
rss-krimstad
rss-viva-fotboll
flashback-forever
blenda-2
aftonbladet-daily
rss-sanning-konsekvens
rss-vad-fan-hande
svd-nyhetsartiklar
rss-frandfors-horna
dagens-eko
rss-krimreportrarna
krimmagasinet
olyckan-inifran
rss-flodet
rss-expressen-dok