LSTMs, Plus a Deep Learning History Lesson with Jürgen Schmidhuber - TWiML Talk #44

LSTMs, Plus a Deep Learning History Lesson with Jürgen Schmidhuber - TWiML Talk #44

This week we have a very special interview to share with you! Those of you who’ve been receiving my newsletter for a while might remember that while in Switzerland last month, I had the pleasure of interviewing Jurgen Schmidhuber, in his lab IDSIA, which is the Dalle Molle Institute for Artificial Intelligence Research in Lugano, Switzerland, where he serves as Scientific Director. In addition to his role at IDSIA, Jurgen is also Co-Founder and Chief Scientist of NNaisense, a company that is using AI to build large-scale neural network solutions for “superhuman perception and intelligent automation.” Jurgen is an interesting, accomplished and in some circles controversial figure in the AI community and we covered a lot of very interesting ground in our discussion, so much so that I couldn't truly unpack it all until I had a chance to sit with it after the fact. We talked a bunch about his work on neural networks, especially LSTM’s, or Long Short-Term Memory networks, which are a key innovation behind many of the advances we’ve seen in deep learning and its application over the past few years. Along the way, Jurgen walks us through a deep learning history lesson that spans 50+ years. It was like walking back in time with the 3 eyed raven. I know you’re really going to enjoy this one, and by the way, this is definitely a nerd alert show! For the show notes, visit twimlai.com/talk/44

Jaksot(784)

Grokking, Generalization Collapse, and the Dynamics of Training Deep Neural Networks with Charles Martin - #734

Grokking, Generalization Collapse, and the Dynamics of Training Deep Neural Networks with Charles Martin - #734

Today, we're joined by Charles Martin, founder of Calculation Consulting, to discuss Weight Watcher, an open-source tool for analyzing and improving Deep Neural Networks (DNNs) based on principles fro...

5 Kesä 20251h 25min

Google I/O 2025 Special Edition - #733

Google I/O 2025 Special Edition - #733

Today, I’m excited to share a special crossover edition of the podcast recorded live from Google I/O 2025! In this episode, I join Shawn Wang aka Swyx from the Latent Space Podcast, to interview Logan...

28 Touko 202526min

RAG Risks: Why Retrieval-Augmented LLMs are Not Safer with Sebastian Gehrmann - #732

RAG Risks: Why Retrieval-Augmented LLMs are Not Safer with Sebastian Gehrmann - #732

Today, we're joined by Sebastian Gehrmann, head of responsible AI in the Office of the CTO at Bloomberg, to discuss AI safety in retrieval-augmented generation (RAG) systems and generative AI in high-...

21 Touko 202557min

From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy - #731

From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy - #731

Today, we're joined by Mahesh Sathiamoorthy, co-founder and CEO of Bespoke Labs, to discuss how reinforcement learning (RL) is reshaping the way we build custom agents on top of foundation models. Mah...

13 Touko 20251h 1min

How OpenAI Builds AI Agents That Think and Act with Josh Tobin - #730

How OpenAI Builds AI Agents That Think and Act with Josh Tobin - #730

Today, we're joined by Josh Tobin, member of technical staff at OpenAI, to discuss the company’s approach to building AI agents. We cover OpenAI's three agentic offerings—Deep Research for comprehensi...

6 Touko 20251h 7min

CTIBench: Evaluating LLMs in Cyber Threat Intelligence with Nidhi Rastogi - #729

CTIBench: Evaluating LLMs in Cyber Threat Intelligence with Nidhi Rastogi - #729

Today, we're joined by Nidhi Rastogi, assistant professor at Rochester Institute of Technology to discuss Cyber Threat Intelligence (CTI), focusing on her recent project CTIBench—a benchmark for evalu...

30 Huhti 202556min

Generative Benchmarking with Kelly Hong - #728

Generative Benchmarking with Kelly Hong - #728

In this episode, Kelly Hong, a researcher at Chroma, joins us to discuss "Generative Benchmarking," a novel approach to evaluating retrieval systems, like RAG applications, using synthetic data. Kelly...

23 Huhti 202554min

Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727

Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727

In this episode, Emmanuel Ameisen, a research engineer at Anthropic, returns to discuss two recent papers: "Circuit Tracing: Revealing Language Model Computational Graphs" and "On the Biology of a Lar...

14 Huhti 20251h 34min

Suosittua kategoriassa Politiikka ja uutiset

uutiscast
aikalisa
politiikan-puskaradio
rss-ootsa-kuullut-tasta
ootsa-kuullut-tasta-2
tervo-halme
rss-podme-livebox
aihe
viisupodi
rss-ulkopoditiikkaa
rss-asiastudio
rss-pinnalla
the-ulkopolitist
radio-antro
rss-vaalirankkurit-podcast
et-sa-noin-voi-sanoo-esittaa
rss-mina-ukkola
rss-polikulaari-pitka-kiekko-ja-muut-ts-podcastit
rss-tasta-on-kyse-ivan-puopolo-verkkouutiset
rss-girls-finish-f1rst