Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini

On a recent episode of The New Stack Agents, Inception Labs CEO Stefano Ermon introduced Mercury 2, a large language model built on diffusion rather than the standard autoregressive approach. Traditional LLMs generate text one token at a time, from left to right, which Ermon describes as “fancy autocomplete.” Diffusion models instead begin with a rough draft and refine it in parallel, much as image systems like Stable Diffusion do.
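The contrast between the two decoding styles can be sketched with a toy example. This is only an illustration under loud assumptions, not Mercury 2's actual algorithm: the "model" here is a stand-in that copies tokens from a known target sentence, and the function names and the `tokens_per_step` parameter are hypothetical. The point is the number of sequential steps each style needs.

```python
import random

MASK = "_"  # placeholder for a position that has not been filled in yet


def autoregressive_decode(target):
    """Emit one token per step, left to right. Returns (tokens, steps)."""
    out = []
    steps = 0
    for tok in target:
        out.append(tok)  # stand-in for one sequential model call
        steps += 1
    return out, steps


def parallel_refine_decode(target, tokens_per_step=4, seed=0):
    """Start from a fully masked draft and fill several positions per step.

    Hypothetical toy: a real diffusion LLM predicts the tokens itself;
    here they are copied from `target` to keep the sketch self-contained.
    """
    rng = random.Random(seed)
    draft = [MASK] * len(target)
    masked = list(range(len(target)))
    rng.shuffle(masked)  # positions are filled in no particular order
    steps = 0
    while masked:
        batch, masked = masked[:tokens_per_step], masked[tokens_per_step:]
        for i in batch:
            draft[i] = target[i]  # one step updates many positions at once
        steps += 1
    return draft, steps


target = "diffusion models refine a rough draft in parallel".split()
ar_tokens, ar_steps = autoregressive_decode(target)
pr_tokens, pr_steps = parallel_refine_decode(target)
print(f"autoregressive: {ar_steps} steps, parallel refinement: {pr_steps} steps")
```

Both functions end with the same eight tokens, but the left-to-right loop takes one step per token while the refinement loop takes one step per batch of positions. Real speedups depend on the model and hardware, not on this step count alone.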

This parallel process allows Mercury 2 to produce over 1,000 tokens per second, which the company's own tests put at five to ten times faster than optimized models from labs such as OpenAI, Anthropic, and Google. Ermon argues that diffusion models make better use of GPUs, and investor Nvidia is supporting the effort to optimize performance.

Mercury 2 matches mid-tier models like Claude Haiku and Google's Gemini Flash rather than top systems such as Claude Opus or GPT-4, but Ermon believes diffusion’s speed and economic advantages will become increasingly compelling as AI applications scale.

Learn more from The New Stack about the latest developments around large language models built on diffusion:

How Diffusion-Based LLM AI Speeds Up Reasoning

Get Ready for Faster Text Generation With Diffusion LLMs
