36 - Attention Is All You Need, with Ashish Vaswani and Jakob Uszkoreit
NLP Highlights, 23 Oct 2017

NIPS 2017 paper. We dig into the details of the Transformer, from the "Attention Is All You Need" paper. Ashish and Jakob give us some motivation for replacing RNNs and CNNs with a more parallelizable self-attention mechanism and describe how this mechanism works, and then we spend the bulk of the episode trying to get their intuitions for _why_ it works. We discuss the positional encoding mechanism, multi-headed attention, trying to use these ideas to replace encoders in other models, and what the self-attention actually learns. It turns out that the lower layers learn something like n-grams (similar to CNNs), and the higher layers learn more semantic-y things, like coreference.

https://www.semanticscholar.org/paper/Attention-Is-All-You-Need-Vaswani-Shazeer/0737da0767d77606169cbf4187b83e1ab62f6077

Minor correction: Talking about complexity equations without the paper in front of you can be tricky, and Ashish and Jakob may have gotten some of the details slightly wrong when we were discussing computational complexity. The high-level point is that self-attention is cheaper than RNNs when the hidden dimension is higher than the sequence length. See the paper for more details.
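To make the complexity point in the correction concrete, here is a minimal sketch in Python. It assumes the per-layer orders of growth given in the paper, roughly n² · d for self-attention and n · d² for a recurrent layer, where n is the sequence length and d is the hidden dimension. The function names are hypothetical and constant factors are ignored, so the numbers only indicate relative scale, not real runtimes.

```python
def self_attention_cost(n: int, d: int) -> int:
    """Rough per-layer cost of self-attention: every position attends to
    every other position (n * n pairs), each pair costing about d operations."""
    return n * n * d


def recurrent_cost(n: int, d: int) -> int:
    """Rough per-layer cost of a recurrent layer: n sequential steps, each a
    d x d matrix-vector product."""
    return n * d * d


def cheaper_layer(n: int, d: int) -> str:
    """Name the layer type with the lower rough cost (ties go to self-attention)."""
    if self_attention_cost(n, d) <= recurrent_cost(n, d):
        return "self-attention"
    return "recurrent"


if __name__ == "__main__":
    # Illustrative values only: a 50-token sentence and a 1000-token document,
    # both with a hidden dimension of 512.
    for n, d in [(50, 512), (1000, 512)]:
        print(n, d, self_attention_cost(n, d), recurrent_cost(n, d), cheaper_layer(n, d))
```

Under these assumptions the two costs cross exactly at n = d: a 50-token sentence with d = 512 favors self-attention, while a 1000-token sequence with the same d does not.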

Episodes (145)

Are LLMs safe?

Curious about the safety of LLMs? 🤔 Join us for an insightful new episode featuring Suchin Gururangan, Young Investigator at Allen Institute for Artificial Intelligence and Data Science Engineer at A...

29 Feb 2024, 42 min

"Imaginative AI" with Mohamed Elhoseiny

This podcast episode features Dr. Mohamed Elhoseiny, a true luminary in the realm of computer vision with over a decade of groundbreaking research. As an Assistant Professor at KAUST, Dr. Elhoseiny's ...

8 Jan 2024, 23 min

142 - Science Of Science, with Kyle Lo

Our first guest with this new format is Kyle Lo, the most senior lead scientist in the Semantic Scholar team at Allen Institute for AI (AI2), who kindly agreed to share his perspective on #Science of ...

28 Dec 2023, 48 min

141 - Building an open source LM, with Iz Beltagy and Dirk Groeneveld

In this special episode of NLP Highlights, we discussed building and open sourcing language models. What is the usual recipe for building large language models? What does it mean to open source them? ...

29 June 2023, 29 min

140 - Generative AI and Copyright, with Chris Callison-Burch

In this special episode, we chatted with Chris Callison-Burch about his testimony in the recent U.S. Congress Hearing on the Interoperability of AI and Copyright Law. We started by asking Chris about ...

6 June 2023, 51 min

139 - Coherent Long Story Generation, with Kevin Yang

How can we generate coherent long stories from language models? Ensuring that the generated story has long range consistency and that it conforms to a high level plan is typically challenging. In this...

24 March 2023, 45 min

138 - Compositional Generalization in Neural Networks, with Najoung Kim

Compositional generalization refers to the capability of models to generalize to out-of-distribution instances by composing information obtained from the training data. In this episode we chatted with...

20 Jan 2023, 48 min

137 - Nearest Neighbor Language Modeling and Machine Translation, with Urvashi Khandelwal

We invited Urvashi Khandelwal, a research scientist at Google Brain, to talk about nearest neighbor language and machine translation models. These models interpolate parametric (conditional) language m...

13 Jan 2023, 35 min
