Arjun Patel on Vector Databases and the Future of Semantic Search
Data Driven, 21 January 2025

Today, we delve into the intriguing world of vector databases, retrieval augmented generation, and a surprising twist—origami.

Our special guest, Arjun Patel, a developer advocate at Pinecone, will be walking us through his mission to make vector databases and semantic search more accessible. Alongside his impressive technical expertise, Arjun is also a self-taught origami artist with a background in statistics from the University of Chicago. Together with co-host Frank La Vigne, we explore Arjun’s unique journey from making speech coaching accessible with AI at Speeko to detecting AI-generated content at Appen.

In this episode, get ready to unravel the mysteries of natural language processing, understand the impact of the attention mechanism in transformers, and discover how AI can even assist in the art of paper folding. From discussing the nuances of RAG systems to sharing personal insights on learning and technology, we promise a session that’s both enlightening and entertaining. So sit back, relax, and get ready to fold your way into the fascinating layers of AI with Arjun Patel on Data Driven.


Show Notes

00:00 Arjun Patel: Bridging AI & Education

04:39 Traditional NLP and Geometric Models

08:40 Co-occurrence and Meaning in Text

13:14 Masked Language Modeling Success

16:50 Understanding Tokenization in AI Models

18:12 Understanding Large Language Models

22:43 Instruction-Following vs Few-Shot Learning

26:43 Rel AI: Open Source Data Tool

31:14 Retrieval-Augmented Generation Explained

33:58 Pinecone: Efficient Vector Database

37:31 AI Found Me: Intern to Innovator

41:10 Impact of Code Generation Models

45:25 Personalized Learning Path Technology

46:57 Mathematical Complexity in Origami Design

50:32 Data, AI, and Origami Insights
