How Training Data Differentiates Falcon, the LLM from the UAE

The name "Falcon" for the UAE’s large language model (LLM) evokes the national bird's qualities of courage and perseverance, reflecting the vision of the Technology Innovation Institute (TII) in Abu Dhabi. TII, launched in 2020, responds to AI’s rapid advances and unintended consequences with an open-source approach intended to improve the community's understanding and control of AI. In this episode of The New Stack Makers, Dr. Hakim Hacid, Executive Director and Acting Chief Researcher at TII, emphasized the importance of perseverance and innovation in overcoming challenges. Falcon gained attention as the first truly open model with capabilities matching many closed-source models, opening new possibilities for practitioners and industry.

Last June, TII released a 40-billion-parameter Falcon model that outperformed LLaMA-65B, along with smaller models that enable local inference without the cloud. The latest model, with 180 billion parameters and trained on 3.5 trillion tokens, illustrates Falcon’s commitment to quality and efficiency over sheer size. What distinguishes Falcon is its data quality: more than 80% of its training data comes from RefinedWeb, a dataset based on CommonCrawl that is cleaned and deduplicated, which results in higher-quality output. This data-centric approach, combined with powerful computational resources, sets Falcon apart in the AI landscape.

Learn more from The New Stack about Open Source AI:

Open Source Initiative Hits the Road to Define Open Source AI

Linus Torvalds on Security, AI, Open Source and Trust

Transparency and Community: An Open Source Vision for AI
