He Co-Invented the Transformer. Now: Continuous Thought Machines - Llion Jones and Luke Darlow [Sakana AI]

He Co-Invented the Transformer. Now: Continuous Thought Machines - Llion Jones and Luke Darlow [Sakana AI]

The Transformer architecture (which powers ChatGPT and nearly all modern AI) might be trapping the industry in a localized rut, preventing us from finding true intelligent reasoning, according to the person who co-invented it. Llion Jones and Luke Darlow, key figures at the research lab Sakana AI, join the show to make this provocative argument, and also introduce new research which might lead the way forwards.


**SPONSOR MESSAGES START**

Build your ideas with AI Studio from Google - http://ai.studio/build

Tufa AI Labs is hiring ML Research Engineers https://tufalabs.ai/

cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy

Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst

Submit investment deck: https://cyber.fund/contact?utm_source=mlst

**END**


The "Spiral" Problem – Llion uses a striking visual analogy to explain what current AI is missing. If you ask a standard neural network to understand a spiral shape, it solves it by drawing tiny straight lines that just happen to look like a spiral. It "fakes" the shape without understanding the concept of spiraling.


Introducing the Continuous Thought Machine (CTM) Luke Darlow deep dives into their solution: a biology-inspired model that fundamentally changes how AI processes information.


The Maze Analogy: Luke explains that standard AI tries to solve a maze by staring at the whole image and guessing the entire path instantly. Their new machine "walks" through the maze step-by-step.

Thinking Time: This allows the AI to "ponder." If a problem is hard, the model can naturally spend more time thinking about it before answering, effectively allowing it to correct its own mistakes and backtrack—something current Language Models struggle to do genuinely.


https://sakana.ai/

https://x.com/YesThisIsLion

https://x.com/LearningLukeD


TRANSCRIPT:

https://app.rescript.info/public/share/crjzQ-Jo2FQsJc97xsBdfzfOIeMONpg0TFBuCgV2Fu8


TOC:

00:00:00 - Stepping Back from Transformers

00:00:43 - Introduction to Continuous Thought Machines (CTM)

00:01:09 - The Changing Atmosphere of AI Research

00:04:13 - Sakana’s Philosophy: Research Freedom

00:07:45 - The Local Minimum of Large Language Models

00:18:30 - Representation Problems: The Spiral Example

00:29:12 - Technical Deep Dive: CTM Architecture

00:36:00 - Adaptive Computation & Maze Solving

00:47:15 - Model Calibration & Uncertainty

01:00:43 - Sudoku Bench: Measuring True Reasoning



REFS:

Why Greatness Cannot be planned [Kenneth Stanley]

https://www.amazon.co.uk/Why-Greatness-Cannot-Planned-Objective/dp/3319155237

https://www.youtube.com/watch?v=lhYGXYeMq_E


The Hardware Lottery [Sara Hooker]

https://arxiv.org/abs/2009.06489

https://www.youtube.com/watch?v=sQFxbQ7ade0


Continuous Thought Machines [Luke Darlow et al / Sakana]

https://arxiv.org/abs/2505.05522

https://sakana.ai/ctm/


LSTM: The Comeback Story? [Prof. Sepp Hochreiter]

https://www.youtube.com/watch?v=8u2pW2zZLCs


Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis [Kumar/Stanley]

https://arxiv.org/pdf/2505.11581


A Spline Theory of Deep Networks [Randall Balestriero]

https://proceedings.mlr.press/v80/balestriero18b/balestriero18b.pdf

https://www.youtube.com/watch?v=86ib0sfdFtw

https://www.youtube.com/watch?v=l3O2J3LMxqI


On the Biology of a Large Language Model [Anthropic, Jack Lindsey et al]

https://transformer-circuits.pub/2025/attribution-graphs/biology.html


The ARC Prize 2024 Winning Algorithm [Daniel Franzen and Jan Disselhoff] “The ARChitects”

https://www.youtube.com/watch?v=mTX_sAq--zY


Neural Turing Machine [Graves]

https://arxiv.org/pdf/1410.5401


Adaptive Computation Time for Recurrent Neural Networks [Graves]

https://arxiv.org/abs/1603.08983


Sudoko Bench [Sakana]

https://pub.sakana.ai/sudoku/

Det här avsnittet är hämtat från ett öppet RSS-flöde och publiceras inte av Podme. Det kan innehålla reklam.

Avsnitt(252)

When AI Decides You're a Threat — Brad Carson

When AI Decides You're a Threat — Brad Carson

Brad Carson was the Army's General Counsel, served two terms in Congress and was Acting Under Secretary of Defense for Personnel and Readiness. He now heads Americans for Responsible Innovation, the A...

31 Maj 1h 20min

Intelligence is collective, not artificial — Prof. Michael I. Jordan (UC Berkeley / Inria)

Intelligence is collective, not artificial — Prof. Michael I. Jordan (UC Berkeley / Inria)

Michael I. Jordan, described by Science magazine as the most influential computer scientist alive, has never thought of himself as an AI researcher. In this conversation he explains why that distincti...

21 Maj 1h 17min

 The AI Models Smart Enough to Know They're Cheating — Beth Barnes & David Rein [METR]

The AI Models Smart Enough to Know They're Cheating — Beth Barnes & David Rein [METR]

Beth Barnes and David Rein on the one graph that ate the AI timelines discourse, and why the two people who built it are the most careful about how you read it.**SPONSOR**Prolific - Quality data. From...

4 Maj 1h 53min

When AI Discovers The Next Transformer - Robert Lange (Sakana)

When AI Discovers The Next Transformer - Robert Lange (Sakana)

Robert Lange, founding researcher at Sakana AI, joins Tim to discuss *Shinka Evolve* — a framework that combines LLMs with evolutionary algorithms to do open-ended program search. The core claim: syst...

13 Mars 1h 18min

"Vibe Coding is a Slot Machine" - Jeremy Howard

"Vibe Coding is a Slot Machine" - Jeremy Howard

Dive into the realities of AI-assisted coding, the origins of modern fine-tuning, and the cognitive science behind machine learning with fast.ai founder Jeremy Howard. In this episode, we unpack why A...

3 Mars 1h 26min

 Evolution "Doesn't Need" Mutation - Blaise Agüera y Arcas

Evolution "Doesn't Need" Mutation - Blaise Agüera y Arcas

What if life itself is just a really sophisticated computer program that wrote itself into existence?Blaise Agüera y Arcas presenting at ALife 2025 — the most technically detailed public walkthrough o...

16 Feb 55min

VAEs Are Energy-Based Models? [Dr. Jeff Beck]

VAEs Are Energy-Based Models? [Dr. Jeff Beck]

What makes something truly *intelligent?* Is a rock an agent? Could a perfect simulation of your brain actually *be* you? In this fascinating conversation, Dr. Jeff Beck takes us on a journey through ...

25 Jan 46min

Abstraction & Idealization: AI's Plato Problem [Mazviita Chirimuuta]

Abstraction & Idealization: AI's Plato Problem [Mazviita Chirimuuta]

Professor Mazviita Chirimuuta joins us for a fascinating deep dive into the philosophy of neuroscience and what it really means to understand the mind.*What can neuroscience actually tell us about how...

23 Jan 53min

Populärt inom Teknik

uppgang-och-fall
market-makers
elbilsveckan
rss-elektrikerpodden
rss-laddstationen-med-elbilen-i-sverige
developers-mer-an-bara-kod
bli-saker-podden
rss-technokratin
bilar-med-sladd
rss-veckans-ai
natets-morka-sida
skogsforum-podcast
hej-bruksbil
bosse-bildoktorn-och-hasse-p
rss-uppgang-och-fall
rss-it-sakerhetspodden
rss-powerboat-sverige-podcast
rss-snacka-om-ai
ai-sweden-podcast
rss-en-ai-till-kaffet