177: Vector Databases

Intro topic: Buying a Car

News/Links:

Book of the Show


Patreon Plug https://www.patreon.com/programmingthrowdown?ty=h


Tool of the Show

Topic: Vector Databases (~54 min)

  • How computers represent data traditionally
    • ASCII values
    • RGB values
  • How traditional compression works
    • Huffman encoding (tree structure)
    • Lossy example: apply a Fourier transform and keep only some of the coefficients
  • How embeddings are computed
    • Pairwise (contrastive) methods
    • Forward models (self-supervised)
  • Similarity metrics
  • Approximate Nearest Neighbors (ANN)
  • Sub-Linear ANN
    • Clustering
    • Space Partitioning (e.g. K-D Trees)
  • What a vector database does
    • Perform nearest-neighbor search with many different similarity metrics
    • Store the vectors and the data structures to support sub-linear ANN
    • Handle updates, deletes, rebalancing/reclustering, backups/restores
  • Examples
    • pgvector: a vector-search extension for PostgreSQL
    • Weaviate, Pinecone
    • Milvus
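
For anyone who wants to see the similarity-metric and clustering ideas from the outline in code, here is a minimal Python sketch. It is not how any of the databases listed above are implemented. It assumes cosine similarity as the metric, non-zero vectors, and hand-picked centroids (a real system would typically learn them, for example with k-means). The function names and the `n_probe` parameter are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_nearest(query, vectors, k=1):
    """Linear scan: score every stored vector and return the ids of the k best."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine_similarity(query, vectors[i]),
        reverse=True,
    )
    return ranked[:k]


def build_clusters(vectors, centroids):
    """Assign each vector id to the centroid it is most similar to."""
    clusters = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        best = max(
            range(len(centroids)),
            key=lambda c: cosine_similarity(v, centroids[c]),
        )
        clusters[best].append(i)
    return clusters


def approx_nearest(query, vectors, centroids, clusters, k=1, n_probe=1):
    """Approximate search: only scan the n_probe clusters closest to the query."""
    probe = sorted(
        range(len(centroids)),
        key=lambda c: cosine_similarity(query, centroids[c]),
        reverse=True,
    )[:n_probe]
    candidates = [i for c in probe for i in clusters[c]]
    candidates.sort(key=lambda i: cosine_similarity(query, vectors[i]), reverse=True)
    return candidates[:k]


# Toy data: two vectors pointing roughly along each axis.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
centroids = [[1.0, 0.0], [0.0, 1.0]]  # hand-picked for the example
clusters = build_clusters(vectors, centroids)

query = [0.2, 0.8]
print(exact_nearest(query, vectors, k=2))                       # [3, 2]
print(approx_nearest(query, vectors, centroids, clusters, k=2))  # [3, 2]
```

With `n_probe=1` the approximate search only scores the vectors in one cluster, which is where the speedup over a linear scan comes from. The trade-off is that a true neighbor sitting just across a cluster boundary can be missed; raising `n_probe` scans more clusters and recovers accuracy at the cost of speed.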

★ Support this podcast on Patreon ★
