The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers and tech-savvy business and IT leaders. The show is hosted by Sam Charrington, a sought-after industry analyst, speaker, commentator and thought leader. Technologies covered include machine learning, artificial intelligence, deep learning, natural language processing, neural networks, analytics, computer science, data science and more.

Episodes (758)

Infrastructure Scaling and Compound AI Systems with Jared Quincy Davis - #740

In this episode, Jared Quincy Davis, founder and CEO at Foundry, introduces the concept of "compound AI systems," which allows users to create powerful, efficient applications by composing multiple, often diverse, AI models and services. We discuss how these "networks of networks" can push the Pareto frontier, delivering results that are simultaneously faster, more accurate, and even cheaper than single-model approaches. Using examples like "laconic decoding," Jared explains the practical techniques for building these systems and the underlying principles of inference-time scaling. The conversation also delves into the critical role of co-design, where the evolution of AI algorithms and the underlying cloud infrastructure are deeply intertwined, shaping the future of agentic AI and the compute landscape. The complete show notes for this episode can be found at https://twimlai.com/go/740.

22 Jul 1h 13min

Building Voice AI Agents That Don’t Suck with Kwindla Kramer - #739

In this episode, Kwindla Kramer, co-founder and CEO of Daily and creator of the open source Pipecat framework, joins us to discuss the architecture and challenges of building real-time, production-ready conversational voice AI. Kwin breaks down the full stack for voice agents, from the models and APIs to the critical orchestration layer that manages the complexities of multi-turn conversations. We explore why many production systems favor a modular, multi-model approach over the end-to-end models demonstrated by large AI labs, and how this impacts everything from latency and cost to observability and evaluation. Kwin also digs into the core challenges of interruption handling, turn-taking, and creating truly natural conversational dynamics, and how to overcome them. We discuss use cases, where the technology is headed, the move toward hybrid edge-cloud pipelines, the exciting future of real-time video avatars, and much more. The complete show notes for this episode can be found at https://twimlai.com/go/739.

15 Jul 1h 13min

Distilling Transformers and Diffusion Models for Robust Edge Use Cases with Fatih Porikli - #738

Today, we're joined by Fatih Porikli, senior director of technology at Qualcomm AI Research, for an in-depth look at several of Qualcomm's accepted papers and demos featured at this year’s CVPR conference. We start with “DiMA: Distilling Multi-modal Large Language Models for Autonomous Driving,” an end-to-end autonomous driving system that incorporates distilling large language models for structured scene understanding and safe motion planning in critical "long-tail" scenarios. We explore how DiMA utilizes LLMs' world knowledge and efficient transformer-based models to significantly reduce collision rates and trajectory errors. We then discuss “SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation,” a diffusion-distilled approach that combines generative models with metric depth estimation to produce sharp, accurate monocular depth maps. Fatih also shares a look at Qualcomm’s on-device demos, including text-to-3D mesh generation, real-time image-to-video and video-to-video generation, and a multi-modal visual question-answering assistant. The complete show notes for this episode can be found at https://twimlai.com/go/738.

9 Jul 1h

Building the Internet of Agents with Vijoy Pandey - #737

Today, we're joined by Vijoy Pandey, SVP and general manager at Outshift by Cisco, to discuss a foundational challenge for the enterprise: how do we make specialized agents from different vendors collaborate effectively? As companies like Salesforce, Workday, and Microsoft all develop their own agentic systems, integrating them creates a complex, probabilistic, and noisy environment, a stark contrast to the deterministic APIs of the past. Vijoy introduces Cisco's vision for an "Internet of Agents," a platform to manage this new reality, and its open-source implementation, AGNTCY. We explore the four phases of agent collaboration (discovery, composition, deployment, and evaluation) and dive deep into the communication stack, from syntactic protocols like A2A, ACP, and MCP to the deeper semantic challenges of creating a shared understanding between agents. Vijoy also unveils SLIM (Secure Low-Latency Interactive Messaging), a novel transport layer designed to make agent-to-agent communication quantum-safe, real-time, and efficient for multi-modal workloads. The complete show notes for this episode can be found at https://twimlai.com/go/737.

24 Jun 56min

LLMs for Equities Feature Forecasting at Two Sigma with Ben Wellington - #736

Today, we're joined by Ben Wellington, deputy head of feature forecasting at Two Sigma. We dig into the team’s end-to-end approach to leveraging AI in equities feature forecasting, covering how they identify and create features, collect and quantify historical data, and build predictive models to forecast market behavior and asset prices for trading and investment. We explore the firm's platform-centric approach to managing an extensive portfolio of features and models, the impact of multimodal LLMs on accelerating the process of extracting novel features, the importance of strict data timestamping to prevent temporal leakage, and the way they consider build vs. buy decisions in a rapidly evolving landscape. Lastly, Ben also shares insights on leveraging open-source models and the future of agentic AI in quantitative finance. The complete show notes for this episode can be found at https://twimlai.com/go/736.

17 Jun 59min

Zero-Shot Auto-Labeling: The End of Annotation for Computer Vision with Jason Corso - #735

Today, we're joined by Jason Corso, co-founder of Voxel51 and professor at the University of Michigan, to explore automated labeling in computer vision. Jason introduces FiftyOne, an open-source platform for visualizing datasets, analyzing models, and improving data quality. We focus on Voxel51’s recent research report, “Zero-shot auto-labeling rivals human performance,” which demonstrates how zero-shot auto-labeling with foundation models can yield significant cost and time savings compared to traditional human annotation. Jason explains how auto-labels, despite being "noisier" at lower confidence thresholds, can lead to better downstream model performance. We also cover Voxel51's "verified auto-labeling" approach, which utilizes a "stoplight" QA workflow (green, yellow, red light) to minimize human review. Finally, we discuss the challenges of handling decision boundary uncertainty and out-of-domain classes, the differences between synthetic data generation in vision and language domains, and the potential of agentic labeling. The complete show notes for this episode can be found at https://twimlai.com/go/735.

10 Jun 56min

Grokking, Generalization Collapse, and the Dynamics of Training Deep Neural Networks with Charles Martin - #734

Today, we're joined by Charles Martin, founder of Calculation Consulting, to discuss WeightWatcher, an open-source tool for analyzing and improving deep neural networks (DNNs) based on principles from theoretical physics. We explore the foundations of the Heavy-Tailed Self-Regularization (HTSR) theory that underpins it, which combines random matrix theory and renormalization group ideas to uncover deep insights about model training dynamics. Charles walks us through WeightWatcher’s ability to detect three distinct learning phases (underfitting, grokking, and generalization collapse) and how its signature “layer quality” metric reveals whether individual layers are underfit, overfit, or optimally tuned. Additionally, we dig into the complexities involved in fine-tuning models, the surprising correlation between model optimality and hallucination, the often-underestimated challenges of search relevance, and their implications for RAG. Finally, Charles shares his insights into real-world applications of generative AI and his lessons learned from working in the field. The complete show notes for this episode can be found at https://twimlai.com/go/734.

5 Jun 1h 25min

Google I/O 2025 Special Edition - #733

Today, I’m excited to share a special crossover edition of the podcast recorded live from Google I/O 2025! In this episode, I join Shawn Wang (aka Swyx) from the Latent Space Podcast to interview Logan Kilpatrick and Shrestha Basu Mallick, PMs at Google DeepMind working on AI Studio and the Gemini API, along with Kwindla Kramer, CEO of Daily and creator of the Pipecat open source project. We cover all the highlights from the event, including enhancements to the Gemini models like thinking budgets and thought summaries, native audio output for expressive voice AI, and the new URL Context tool for research agents. The discussion also digs into the Gemini Live API, covering its architecture, the challenges of building real-time voice applications (such as latency and voice activity detection), and new features like proactive audio and asynchronous function calling. Finally, don’t miss our guests’ wish lists for next year’s I/O! The complete show notes for this episode can be found at https://twimlai.com/go/733.

28 May 26min
