Dynamic Token Merging for Efficient Byte-level Language Models with Julie Kallini - #724

Dynamic Token Merging for Efficient Byte-level Language Models with Julie Kallini - #724

Today, we're joined by Julie Kallini, PhD student at Stanford University to discuss her recent papers, “MrT5: Dynamic Token Merging for Efficient Byte-level Language Models” and “Mission: Impossible Language Models.” For the MrT5 paper, we explore the importance and failings of tokenization in large language models—including inefficient compression rates for under-resourced languages—and dig into byte-level modeling as an alternative. We discuss the architecture of MrT5, its ability to learn language-specific compression rates, its performance on multilingual benchmarks and character-level manipulation tasks, and its performance and efficiency. For the “Mission: Impossible Language Models” paper, we review the core idea behind the research, the definition and creation of impossible languages, the creation of impossible language training datasets, and explore the bias of language model architectures towards natural language. The complete show notes for this episode can be found at https://twimlai.com/go/724.

Denne episoden er hentet fra en åpen RSS-feed og er ikke publisert av Podme. Den kan derfor inneholde annonser.

Episoder(788)

Why AI Agents Break the GenAI Security Model with Devvret Rishi - #770

Why AI Agents Break the GenAI Security Model with Devvret Rishi - #770

In this episode, Sam talks with Dev Rishi, GM of AI at Rubrik, about what happens when agents move beyond answering questions and start taking action across tools, systems, and business processes. We...

16 Jun 56min

Is RAG Dead? Lessons from Building AI for Tax Law with Alex Bowcut - #769

Is RAG Dead? Lessons from Building AI for Tax Law with Alex Bowcut - #769

As context windows grow into the millions of tokens, many AI practitioners are questioning whether retrieval-augmented generation (RAG) is still necessary. If modern models can ingest entire libraries...

9 Jun 51min

Relational Foundation Models for Enterprise Data with Jure Leskovec - #768

Relational Foundation Models for Enterprise Data with Jure Leskovec - #768

In this episode, Jure Leskovec, co-founder and chief scientist at Kumo and professor of computer science at Stanford, joins us to explore two fronts of his work: AI for science and relational deep lea...

21 Mai 1h 6min

How to Find the Agent Failures Your Evals Miss with Scott Clark - #767

How to Find the Agent Failures Your Evals Miss with Scott Clark - #767

In this episode, Scott Clark, co-founder and CEO of Distributional, joins us to explore how teams can reliably operate and improve complex LLM systems and agents in production. Scott introduces a Masl...

7 Mai 53min

How to Engineer AI Inference Systems with Philip Kiely - #766

How to Engineer AI Inference Systems with Philip Kiely - #766

In this episode, Philip Kiely, head of AI education at Baseten, joins us to unpack the fast-evolving discipline of inference engineering. We explore why inference has become the stickiest and most cri...

30 Apr 54min

How Capital One Delivers Multi-Agent Systems with Rashmi Shetty - #765

How Capital One Delivers Multi-Agent Systems with Rashmi Shetty - #765

In this episode, Rashmi Shetty, senior director of enterprise generative AI platform at Capital One, joins us to explore how the company is designing, deploying, and scaling multi-agent systems in a h...

16 Apr 54min

The Race to Production-Grade Diffusion LLMs with Stefano Ermon - #764

The Race to Production-Grade Diffusion LLMs with Stefano Ermon - #764

Today, we're joined by Stefano Ermon, associate professor at Stanford University and CEO of Inception Labs to discuss diffusion language models. We dig into how diffusion approaches—traditionally used...

26 Mar 1h 3min

Agent Swarms and Knowledge Graphs for Autonomous Software Development with Siddhant Pardeshi - #763

Agent Swarms and Knowledge Graphs for Autonomous Software Development with Siddhant Pardeshi - #763

In this episode, Sid Pardeshi, co-founder and CTO of Blitzy, joins us to discuss building autonomous development systems able to deliver production-ready software at enterprise scale. Sid contrasts AI...

10 Mar 1h 16min

Populært innen Politikk og nyheter

giver-og-gjengen-vg
aftenpodden
aftenpodden-usa
fotballpodden-2
forklart
stopp-verden
popradet
nokon-ma-ga
lydartikler-fra-aftenposten
rss-espen-lee-usensurert
det-store-bildet
rss-gukild-johaug
dine-penger-pengeradet
rss-ness
aftenbla-bla
hanna-de-heldige
rss-utenrikskomiteen-med-bogen-og-grasvik
frokostshowet-pa-p5
e24-podden
rss-penger-polser-og-politikk