Ep. 57: Kevin Wang, NeurIPS Best Paper Author and OpenAI Researcher
Delta Podcast, 28 May

Kevin Wang is the first author of the NeurIPS 2025 Best Paper, titled "1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities". He's currently a researcher at OpenAI, where he works on RL/reasoning. Before coming to OpenAI, Kevin studied CS at Princeton.

Delta Institute (deltainstitutes.org) supports exceptional researchers and engineers, from academia to industry and beyond. They host technical events to bring great people together, a podcast that gives industry/academic leaders a platform to share their experiences, a small fellows program that builds a tight-knit community of exceptional people, and a grant program that provides compute/mentorship for research projects.

Timestamps:
00:00 Introduction
00:26 Overview of the 1000 Layer Networks Paper
00:42 Motivation and Background of the Research
01:37 Self-Supervised Reinforcement Learning Paradigm
04:16 Challenges and Innovations in Data Scaling
06:23 Hindsight Experience Replay and Its Impact
08:56 Classification vs Regression in Reinforcement Learning
12:25 Training Stability and Architectural Components
14:23 Key Results and Performance Gains
17:23 Qualitative Behaviors and Representation Learning
19:44 Scaling Depth and Batch Size
23:06 Limits of Scaling in Reinforcement Learning
23:55 Exploring Actor Loss and Layer Depth in Training
24:51 Scaling Layers for Complex Tasks
28:28 Challenges and Innovations in Deep Network Training
30:36 Future Directions in Reinforcement Learning
37:32 Personal Journey and Career Path

This episode was added to Podme through an open RSS feed and is not Podme's own production. The episode may therefore contain advertising.

Episodes (60)

Ep. 60: Ronak Malde, Trajectory CEO and Former DeepMind Researcher

Ronak Malde is the CEO of Trajectory (trajectory.ai), where he's working on bringing continual learning to enterprises. Before Trajectory, he worked on research at DeepMind and trained the SWE-1 model...

28 May, 29 min

Ep. 59: Alex Shan, Judgment Labs CEO

Alex Shan is the CEO of Judgment Labs (judgmentlabs.ai), where he's working on building agent behavior monitoring infrastructure. Before Judgment, he worked at Juniper Networks and Stanford AI Lab. Del...

28 May, 45 min

Ep. 58: Andrew Dai, Elorian CEO and Former DeepMind Research Director

Andrew Dai is the co-founder and CEO of Elorian, a new visual reasoning research and product lab. Before Elorian, Andrew was a research director at DeepMind, where he was the Gemini data area co-lead,...

28 May, 32 min

Ep. 56: Grace Li, Design Arena Co-Creator and Arcada Labs Co-Founder

Grace Li is the co-founder of Arcada Labs (arcada.dev), creators of Design Arena, Prediction Arena, and Social Arena. Arcada's vision is to build portals that bridge AI to the real world by building r...

28 May, 38 min

Ep. 55: Karthik Narasimhan, GPT Co-Author and Princeton CS Professor

Karthik Narasimhan is an associate professor at Princeton's CS Department and the co-director of Princeton NLP. He's led numerous projects at the intersection of language and agents, including ReACT, ...

1 January, 36 min

Ep. 54: Michael Wornow: Kinetic Systems CEO and Stanford CS PhD

Michael is the CEO of Kinetic Systems and recently finished his CS PhD at Stanford, where he was advised by Nigam Shah and Chris Ré. Before coming to Stanford, Michael studied CS and Statistics at Har...

31 December 2025, 41 min

Ep. 53: Brian Zhan, Partner at Striker VP and Investor in Reflection, Skild, Periodic, Ricursive

Brian Zhan is a partner at Striker Venture Partners. He's invested in several leading research startups, including Periodic Labs, Reflection AI, Skild AI, Dyna Robotics, Voyage AI, and more. Before co...

23 December 2025, 27 min