The database for all your AI needs
Database School, 16 Sep 2025

Marcel Kornacker, the creator of Apache Impala and co-creator of Apache Parquet, joins me to talk about his latest project: Pixeltable, a multimodal AI database that combines structured and unstructured data with rich, Python-native workflows.


From ingestion to vector search, transcription to snapshots, Pixeltable eliminates painful data plumbing for modern AI teams.



Follow Marcel

  • Pixeltable: https://pixeltable.com
  • Pixeltable GitHub: https://github.com/pixeltable/pixeltable
  • LinkedIn: https://www.linkedin.com/in/marcelkornacker



Follow Aaron

  • Twitter: https://twitter.com/aarondfrancis
  • LinkedIn: https://www.linkedin.com/in/aarondfrancis
  • Website: https://aaronfrancis.com – find articles, podcasts, courses, and more
  • Database School: https://databaseschool.com



Chapters

  • 0:00 – Introduction
  • 0:20 – Meet Marcel Kornacker
  • 1:19 – Early career and grad school in databases
  • 2:12 – Joining Google and building F1
  • 3:42 – How F1 used Spanner at Google
  • 4:01 – Starting Apache Impala at Cloudera
  • 6:02 – Why SQL still matters
  • 7:29 – What keeps Marcel fascinated with databases
  • 9:37 – The “SQL is dead” waves and shift to AI
  • 10:21 – Observing pain points in computer vision pipelines
  • 13:02 – Multimodal data challenges and the idea for Pixeltable
  • 16:10 – How Pixeltable handles transformations with computed columns
  • 26:29 – Example: processing video, audio, and transcripts in Pixeltable
  • 33:12 – DAG execution and parallelism explained
  • 37:00 – Transactional guarantees in Pixeltable
  • 39:00 – Iterators and chunking data for search
  • 42:26 – Using embeddings and semantic search
  • 47:05 – Updating data and incremental recomputation
  • 50:06 – Thoughts on RAG and hybrid search
  • 53:14 – Real-world use cases and dataset curation
  • 57:00 – Example: labeling food waste on cruise ships
  • 1:02:00 – Labeling workflows and syncing annotations
  • 1:02:41 – Pixeltable’s roadmap and cloud vision
  • 1:07:10 – How to get involved with Pixeltable
  • 1:09:03 – Closing and where to find Marcel

