Tim & Heinrich: Democratizing Reinforcement Learning Research

Since reinforcement learning requires hefty compute resources, it can be tough to keep up without a serious budget of your own. Find out how the team at Facebook AI Research (FAIR) is looking to increase access and level the playing field with the help of NetHack, an archaic rogue-like video game from the late 80s.

Links discussed:
The NetHack Learning Environment: https://ai.facebook.com/blog/nethack-learning-environment-to-advance-deep-reinforcement-learning/
Reinforcement learning, intrinsic motivation: https://arxiv.org/abs/2002.12292
Knowledge transfer: https://arxiv.org/abs/1910.08210

Tim Rocktäschel is a Research Scientist at Facebook AI Research (FAIR) London and a Lecturer in the Department of Computer Science at University College London (UCL). At UCL, he is a member of the UCL Centre for Artificial Intelligence and the UCL Natural Language Processing group. Prior to that, he was a Postdoctoral Researcher in the Whiteson Research Lab, a Stipendiary Lecturer in Computer Science at Hertford College, and a Junior Research Fellow in Computer Science at Jesus College, at the University of Oxford.
https://twitter.com/_rockt

Heinrich Kuttler is an AI and machine learning researcher at Facebook AI Research (FAIR) and before that was a research engineer and team lead at DeepMind.
https://twitter.com/HeinrichKuttler
https://www.linkedin.com/in/heinrich-kuttler/

Topics covered:
0:00 a lack of reproducibility in RL
1:05 What is NetHack and how did the idea come to be?
5:46 RL in Go vs NetHack
11:04 performance of vanilla agents, what do you optimize for
18:36 transferring domain knowledge, source diving
22:27 human vs machines intrinsic learning
28:19 ICLR paper - exploration and RL strategies
35:48 the future of reinforcement learning
43:18 going from supervised to reinforcement learning
45:07 reproducibility in RL
50:05 most underrated aspect of ML, biggest challenges?