Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini

Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini

On a recent episode of the The New Stack Agents, Inception Labs CEO Stefano Ermon introduced Mercury 2, a large language model built on diffusion rather than the standard autoregressive approach. Traditional LLMs generate text token by token from left to right, which Ermon describes as “fancy autocomplete.” In contrast, diffusion models begin with a rough draft and refine it in parallel, similar to image systems like Stable Diffusion.

This parallel process allows Mercury 2 to produce over 1,000 tokens per second—five to ten times faster than optimized models from labs such as OpenAI, Anthropic, and Google, according to company tests. Ermon argues diffusion models better leverage GPUs, with support from investor Nvidia to optimize performance.

While Mercury 2 matches mid-tier models like Claude Haiku and Google Flash rather than top systems such as Claude Opus or GPT-4, Ermon believes diffusion’s speed and economic advantages will become increasingly compelling as AI applications scale.

Learn more from The New Stack about the latest developments around around large language model built on diffusion:

How Diffusion-Based LLM AI Speeds Up Reasoning

Get Ready for Faster Text Generation With Diffusion LLMs

Join our community of newsletter subscribers to stay on top of the news and at the top of your game.

Det här avsnittet är hämtat från ett öppet RSS-flöde och publiceras inte av Podme. Det kan innehålla reklam.

Avsnitt(300)

JetBrains is selling independence as the rest of AI coding picks sides

JetBrains is selling independence as the rest of AI coding picks sides

JetBrains is positioning itself as the last major independent AI coding-tool vendor in a market increasingly tied to hyperscalers and foundation model labs. Speaking at Google Cloud Next, JetBrains VP...

21 Maj 26min

Why Block handed Goose to the Linux Foundation

Why Block handed Goose to the Linux Foundation

What began as an internal developer tool atBlockhas evolved into a broader open-source initiative with industry backing. Goose, Block’s AI coding agent, followed a path similar to Amazon’s transformat...

15 Maj 19min

Fivetran's CPO: closed data stacks won't survive the agent era

Fivetran's CPO: closed data stacks won't survive the agent era

At Google Cloud Next 2026, Fivetran Chief Product Officer Anjan Kundavaram argued that enterprise data systems are unprepared for the scale of AI-driven analytics. Unlike humans, AI agents can generat...

13 Maj 22min

The new FinOps problem isn't cloud bills

The new FinOps problem isn't cloud bills

At Google Cloud Next 2026, Finout co-founder and CEO Roi Ravhon and Google Cloud FinOps lead Pathik Sharma discussed how FinOps is rapidly evolving for the AI era. Ravhon argued that while cloud FinOp...

12 Maj 28min

How Microsoft is governing thousands of Kubernetes clusters without manual intervention

How Microsoft is governing thousands of Kubernetes clusters without manual intervention

Managing Kubernetes at fleet scale introduces significant complexity, especially as organizations expand from a few clusters to hundreds or thousands across cloud, on-premises, and edge environments. ...

7 Maj 25min

Why long-running AI agents break on HTTP and how Ably is fixing it

Why long-running AI agents break on HTTP and how Ably is fixing it

In this episode ofThe New Stack Makers, Matthew O’Riordan, CEO of Ably, explains how infrastructure originally built for human collaboration is now well-suited for long-running AI agents. While Ably i...

6 Maj 31min

Why the Linux Foundation adopted MCP, with Jim Zemlin and Mazin Gilbert

Why the Linux Foundation adopted MCP, with Jim Zemlin and Mazin Gilbert

Agentic AI is advancing rapidly, with open-source projects racing to keep pace with real-world deployment. To accelerate progress, the Linux Foundation consolidated key technologies—Model Context Prot...

6 Maj 32min

Fresh data has us asking, does AI demand Kubernetes?

Fresh data has us asking, does AI demand Kubernetes?

Kubernetes is rapidly emerging as the de facto operating system for AI, with two-thirds of organizations using it for generative AI inference and 82% adopting it in production. Its ecosystem — includi...

1 Maj 23min

Populärt inom Politik & nyheter

aftonbladet-krim
svenska-fall
motiv
p3-krim
flashback-forever
aftonbladet-daily
politiken
rss-sanning-konsekvens
rss-krimreportrarna
rss-flodet
rss-vad-fan-hande
rss-frandfors-horna
svd-ledarredaktionen
rss-aftonbladet-krim
grans
krimmagasinet
spar
dagens-eko
rss-krimstad
blenda-2