Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini

On a recent episode of The New Stack Agents, Inception Labs CEO Stefano Ermon introduced Mercury 2, a large language model built on diffusion rather than the standard autoregressive approach. Traditional LLMs generate text one token at a time, from left to right, which Ermon describes as “fancy autocomplete.” Diffusion models, by contrast, begin with a rough draft and refine it in parallel, much as image systems like Stable Diffusion do.
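To make the contrast concrete, here is a toy sketch of the two decoding styles. It is not Mercury 2's algorithm, which the episode notes do not describe: there is no model here, the "draft" is simply copied from a known target, and the function names and the `passes` parameter are made up for illustration. The only point is the number of sequential steps each style needs.

```python
import random

MASK = "_"  # placeholder for a position that has not been filled in yet


def autoregressive_decode(target):
    """Toy left-to-right decoding: one sequential step per token."""
    out, steps = [], 0
    for tok in target:
        out.append(tok)  # stand-in for one model call that emits one token
        steps += 1
    return out, steps


def parallel_refine_decode(target, passes=4, seed=0):
    """Toy diffusion-style decoding: start fully masked and fill in
    several positions on every pass (hypothetical schedule)."""
    rng = random.Random(seed)
    draft = [MASK] * len(target)
    hidden = list(range(len(target)))
    rng.shuffle(hidden)  # positions are revealed in arbitrary order
    steps = 0
    for p in range(passes):
        # Spread the still-hidden positions evenly over the remaining passes.
        k = -(-len(hidden) // (passes - p))  # ceiling division
        for i in hidden[:k]:
            draft[i] = target[i]  # stand-in for one pass updating many tokens
        hidden = hidden[k:]
        steps += 1
    return draft, steps


tokens = "diffusion models refine a rough draft in parallel".split()
ar_out, ar_steps = autoregressive_decode(tokens)
df_out, df_steps = parallel_refine_decode(tokens, passes=4)
print(ar_steps, df_steps)
```

For this eight-token sentence the left-to-right loop takes eight sequential steps, while the refinement loop takes four, because each pass touches several positions at once. In a real system each pass is a full model call over the whole sequence, so the actual speedup depends on the number of passes and on the hardware.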

This parallel process allows Mercury 2 to produce over 1,000 tokens per second—five to ten times faster than optimized models from labs such as OpenAI, Anthropic, and Google, according to company tests. Ermon argues diffusion models better leverage GPUs, with support from investor Nvidia to optimize performance.

While Mercury 2 matches mid-tier models like Claude Haiku and Google Flash rather than top systems such as Claude Opus or GPT-4, Ermon believes diffusion’s speed and economic advantages will become increasingly compelling as AI applications scale.

Learn more from The New Stack about the latest developments around diffusion-based large language models:

How Diffusion-Based LLM AI Speeds Up Reasoning

Get Ready for Faster Text Generation With Diffusion LLMs

Join our community of newsletter subscribers to stay on top of the news and at the top of your game.

This episode is taken from an open RSS feed and is not published by Podme. It may contain advertising.

Episodes (300)

How SUSE positions itself as the infrastructure layer for the AI era

In this episode of The New Stack Makers, Pete Smails outlines how SUSE is evolving from its Linux roots into an AI-native infrastructure platform. Speaking at KubeCon + CloudNativeCon Europe 2026, Smails ex...

30 Apr 26min

Cut AI token usage by 96%? Here’s how AWS Strands Agents does it.

In this episode of The New Stack Makers, AWS developer advocate Morgan Willis demonstrates Strands Agents, an open source agentic framework with rapid adoption since its launch. Using a simple account...

29 Apr 28min

Why Broadcom is betting on a private cloud comeback

Broadcom’s VMware Cloud Foundation (VCF) is evolving from a turnkey infrastructure stack into a modern application platform, balancing simplicity with the flexibility demanded by Kubernetes-driven env...

28 Apr 23min

Why Broadcom gave Velero to the CNCF Sandbox — and what it means for Kubernetes data protection

Broadcom continues to expand its role as a major contributor to cloud-native open source, particularly within the Cloud Native Computing Foundation (CNCF) ecosystem. Its recent donation of Velero—orig...

25 Apr 22min

Why AI engineering needs old-school discipline

In this episode of The New Stack Makers, Nimisha Asthagiri of Thoughtworks explores why many AI initiatives stall between proof of concept and production. A key issue is that organizations focus on sp...

24 Apr 24min

Jim Bugwadia on why finding a Kubernetes problem is only half the battle for Kyverno users

Graduating within the CNCF marks a major milestone for an open source project, signaling not just technical maturity but strong governance, security practices, and widespread adoption. Kyverno, a Kube...

23 Apr 23min

How AWS Bedrock is shaping Model Context Protocol

At the MCP Summit in New York City, AWS’s Luca Chang, a Bedrock team member and MCP specification maintainer, discussed the rapid rise of the Model Context Protocol (MCP) as a standard for connecting ...

22 Apr 31min

Why Microsoft is betting on temporary identities to stop autonomous agents from going rogue

At KubeCon Europe 2026, Jorge Palma outlined how Microsoft is advancing AI operations across cloud and edge environments. He demonstrated an agent capable of diagnosing, mitigating, and explaining applic...

21 Apr 24min
