Qdrant - Vector Database

Points, Vectors, Payloads

November 2022. OpenAI ships ChatGPT. Within 60 days, teams across tech are scrambling to build RAG - Retrieval-Augmented Generation. The recipe: chunk documents, embed each chunk, store the vectors, retrieve the top-k most similar at query time. As a rough rule of thumb, pgvector is comfortable up to around 1M vectors before search latency starts to degrade without careful tuning; by 10M documents, many teams move to a dedicated vector store. Qdrant, written in Rust, is built for that scale without configuration gymnastics. Its three primitives - point, vector, payload - are the storage model underneath a typical production RAG pipeline.

  • **RAG pipeline**: every document chunk gets embedded and stored as a Point. The query embedding drives a nearest-neighbor search, and the top chunks become context for the LLM
  • **Recommendation**: a liked item's vector drives a nearest-neighbor query - similar vectors surface similar products or films
  • **Code search at GitHub scale**: each function gets a code embedding stored as a Point. 'Find a sorting function' becomes a semantic query, not a grep
  • **E-commerce visual search**: product image embedding stored as a Point; 'find similar items' queries the visual embedding space

From word2vec to production vector stores

In 2013 Tomas Mikolov and colleagues at Google published word2vec - a neural technique that maps words to fixed-size float vectors so that semantically similar words cluster nearby in vector space. 'king' minus 'man' plus 'woman' lands near 'queen'. That single property - semantic proximity encoded as geometric proximity - planted the seed for every modern embedding model. The 2017 transformer paper by Vaswani et al. paved the way for models capable of embedding entire documents. By 2021, Approximate Nearest Neighbor (ANN) search had become a first-class infrastructure problem: teams needed to query billions of vectors in milliseconds. Qdrant launched that year as an open-source Rust engine purpose-built for filtered vector search - an HNSW index plus schema-free payload filtering in one system. By 2023 it was competing with Pinecone, Weaviate, and Chroma at the center of the RAG boom.

Prerequisites

  • Collections: Creation and Configuration

Point structure: ID + vector + payload

**A Point** is the fundamental unit of data in Qdrant. Every point packs three things: a unique identifier, a vector (the embedding), and an optional payload (arbitrary JSON metadata).

**ID: integer vs UUID.** Integer IDs are simpler and slightly faster (no string hashing). UUIDs are convenient when identifiers already exist in another database. The constraint: **ID must be unique within a collection**.

**Payload is schema-free JSON.** No fixed schema. Different points can carry entirely different fields. Supported value types: strings, numbers, booleans, arrays, nested objects, `null`. For efficient filtering on a field, add a payload index (lesson 8).
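The three parts of a point can be written down as types. This is only a sketch: the names `PointId`, `Payload`, `Point`, and `hasUniqueIds` are made up for illustration and are not taken from any Qdrant client library, and the UUID and 4-dimensional vectors are placeholder values.

```typescript
// A point ID is either an unsigned integer or a UUID string.
type PointId = number | string;

// Payload values mirror the JSON types listed above.
type PayloadValue =
  | string
  | number
  | boolean
  | null
  | PayloadValue[]
  | { [key: string]: PayloadValue };

type Payload = { [key: string]: PayloadValue };

interface Point {
  id: PointId;
  vector: number[]; // the embedding
  payload?: Payload; // optional metadata
}

// Two points in the same collection carrying unrelated payload fields.
const examplePoints: Point[] = [
  { id: 1, vector: [0.1, 0.2, 0.3, 0.4], payload: { category: "science" } },
  {
    id: "3f2b8c1e-4a6d-4e9b-9c7a-5d1e2f3a4b5c", // placeholder UUID
    vector: [0.4, 0.3, 0.2, 0.1],
    payload: { topic: "physics", year: 2024 },
  },
];

// Client-side sanity check: IDs must be unique within a collection.
function hasUniqueIds(batch: Point[]): boolean {
  const seen = new Set(batch.map((p) => String(p.id)));
  return seen.size === batch.length;
}
```

Both example points type-check against the same `Point` shape even though their payloads share no fields, which is what "schema-free" means in practice.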

Point 1 has payload `{category: 'science'}`, point 2 has `{topic: 'physics', year: 2024}`. Is this valid in Qdrant?

Upsert: adding and updating points

**Upsert** (update + insert) is the primary write operation in Qdrant. If a point with that ID already exists, it is fully replaced: vector and payload. If not, it is created. There are no separate INSERT and UPDATE paths.
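The create-or-replace behavior can be modeled with a toy in-memory collection. This does not talk to Qdrant at all; `ToyCollection` and its `"created"` / `"replaced"` return values are hypothetical, chosen only to make the semantics visible.

```typescript
interface StoredPoint {
  id: number | string;
  vector: number[];
  payload?: Record<string, unknown>;
}

// Toy stand-in for a collection, keyed by point ID.
class ToyCollection {
  private points = new Map<number | string, StoredPoint>();

  // Create the point if the ID is new, otherwise replace it entirely.
  upsert(point: StoredPoint): "created" | "replaced" {
    const existed = this.points.has(point.id);
    this.points.set(point.id, point);
    return existed ? "replaced" : "created";
  }

  get(id: number | string): StoredPoint | undefined {
    return this.points.get(id);
  }
}

const toy = new ToyCollection();
const firstWrite = toy.upsert({
  id: 42,
  vector: [0.1, 0.2],
  payload: { title: "Old", views: 100 },
});
const secondWrite = toy.upsert({
  id: 42,
  vector: [0.9, 0.8],
  payload: { title: "New" },
});
// After the second call the old `views` field is gone: replace, not merge.
```

Because the second upsert replaces the whole point, any payload field that should survive has to be sent again, or updated through the payload operations instead.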

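**`wait: true`** blocks until the operation has actually been applied, so the point is visible to subsequent reads and searches. Without it (the REST API default is `false`), the server responds as soon as the operation is acknowledged and written to the WAL: durable, but not necessarily searchable yet. Client libraries may choose a different default, so set it explicitly. Use `wait: true` whenever the next step depends on the data being readable.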
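On the wire, an upsert is a `PUT` to the collection's `/points` endpoint, with `wait` passed as a query parameter. The sketch below only builds the request and does not send it; `buildUpsertRequest`, `HttpRequestSpec`, the collection name `articles`, and the local URL are illustrative assumptions.

```typescript
interface UpsertPoint {
  id: number | string;
  vector: number[];
  payload?: Record<string, unknown>;
}

interface HttpRequestSpec {
  method: "PUT";
  url: string;
  body: string;
}

// Build (but do not send) an upsert request for Qdrant's REST API.
function buildUpsertRequest(
  baseUrl: string,
  collection: string,
  points: UpsertPoint[],
  wait: boolean
): HttpRequestSpec {
  const path = `/collections/${encodeURIComponent(collection)}/points`;
  return {
    method: "PUT",
    url: `${baseUrl}${path}?wait=${wait}`,
    body: JSON.stringify({ points }),
  };
}

// Assumed local instance on Qdrant's default REST port.
const upsertRequest = buildUpsertRequest(
  "http://localhost:6333",
  "articles",
  [{ id: 1, vector: [0.1, 0.2, 0.3, 0.4], payload: { category: "science" } }],
  true
);
```

The resulting object can be handed to `fetch(upsertRequest.url, { method: upsertRequest.method, body: upsertRequest.body, headers: { "Content-Type": "application/json" } })`, or the same call can be made through a client library.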

**Generating IDs:** if there's no natural integer ID, generate a UUID: `crypto.randomUUID()` in Node.js 18+. UUID v4 is the standard choice. Qdrant accepts both formats.

Upsert was called with a point id=42 that already exists. What happens?

Payload operations

**Payload can be updated independently of the vector.** This matters: regenerating an embedding costs real money and time, but metadata changes constantly. Qdrant supports surgical payload updates.

**setPayload vs overwritePayload:** `setPayload` is like `Object.assign()` - adds/updates fields, leaving the rest intact. `overwritePayload` is like `obj = {}` - clears first, then sets. Reach for `setPayload` in 90% of cases.

A point has payload: `{title: 'Old', views: 100}`. `setPayload({views: 200, tags: ['ai']})` is called. What's in the payload now?
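The difference between the two operations can be traced with plain objects. The helpers `applySetPayload` and `applyOverwritePayload` are hypothetical local functions that imitate the semantics described above with a top-level merge; they are not client library calls.

```typescript
type PayloadFields = Record<string, unknown>;

// setPayload semantics: merge new fields into the existing payload.
function applySetPayload(
  current: PayloadFields,
  update: PayloadFields
): PayloadFields {
  return { ...current, ...update };
}

// overwritePayload semantics: drop the existing payload, keep only the new fields.
function applyOverwritePayload(
  _current: PayloadFields,
  replacement: PayloadFields
): PayloadFields {
  return { ...replacement };
}

const payloadBefore = { title: "Old", views: 100 };
const payloadUpdate = { views: 200, tags: ["ai"] };

const mergedPayload = applySetPayload(payloadBefore, payloadUpdate);
// mergedPayload: { title: "Old", views: 200, tags: ["ai"] }

const overwrittenPayload = applyOverwritePayload(payloadBefore, payloadUpdate);
// overwrittenPayload: { views: 200, tags: ["ai"] }
```

With the merge, `title` survives and `views` is updated; with the overwrite, `title` is lost because it was not part of the new payload.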

CRUD operations on points

**Retrieving, updating, and deleting** individual points covers the full CRUD surface for managing Qdrant data.

**`with_vector: false`** - always set this when the vector isn't needed. A 1536-dimensional float32 vector takes ~6KB (1536 × 4 bytes). Fetching 1000 points with vectors included adds about 6MB to the response for no reason.
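The size estimate is simple arithmetic, shown here for raw float32 data. The function names are illustrative, and the figures are a lower bound: vectors serialized as JSON text generally take more bytes than their binary size.

```typescript
const BYTES_PER_FLOAT32 = 4;

// Raw size of one vector with the given number of dimensions.
function vectorBytes(dimensions: number): number {
  return dimensions * BYTES_PER_FLOAT32;
}

// Raw vector data carried by a response that returns `pointCount` points.
function responseVectorBytes(dimensions: number, pointCount: number): number {
  return vectorBytes(dimensions) * pointCount;
}

const bytesPerVector = vectorBytes(1536); // 6144 bytes, i.e. 6 KiB
const bytesPerThousand = responseVectorBytes(1536, 1000); // 6,144,000 bytes, roughly 6 MB
```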

**Anti-pattern:** storing full document text in Qdrant payload.

**Instead:** store only metadata and a short preview/chunk in payload. Full text belongs in the primary database (PostgreSQL), referenced by ID.

**Why:** payload is returned with every search result unless it is switched off. At 10KB of text per point, a query returning 100 results carries about 1MB of payload in a single response. Qdrant isn't optimized for large text storage - that's PostgreSQL's or S3's job.
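One way to keep payloads small is to derive them from the source document at indexing time. In this sketch the field names (`doc_id`, `title`, `preview`), the `SourceDocument` shape, and the 200-character preview length are all assumptions made for illustration.

```typescript
interface SourceDocument {
  id: string; // primary key in the primary database (hypothetical)
  title: string;
  text: string; // full text, kept out of Qdrant
}

interface SlimPayload {
  doc_id: string; // reference back to the primary database
  title: string;
  preview: string; // short excerpt for displaying results
}

const PREVIEW_CHARS = 200; // arbitrary illustrative limit

// Keep only metadata, a reference ID, and a short preview.
function toSlimPayload(
  doc: SourceDocument,
  previewChars: number = PREVIEW_CHARS
): SlimPayload {
  return {
    doc_id: doc.id,
    title: doc.title,
    preview: doc.text.slice(0, previewChars),
  };
}

const longDocument: SourceDocument = {
  id: "doc-123",
  title: "Vector search basics",
  text: "x".repeat(10_000), // stands in for ~10KB of document text
};

const slimPayload = toSlimPayload(longDocument);
```

After a search, the application would use `doc_id` from each hit to load the full text from the primary store only for the results it actually displays.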

All 10M points in a collection need to be iterated for an export. Which method fits?
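Full traversal is a pagination loop: request a page, process it, and continue from the offset the server returns until there is none. The sketch below keeps the loop independent of any client by injecting the page fetcher. `scrollAll`, `FetchPage`, and `makeFakeFetcher` are hypothetical names, and the fake fetcher only imitates a scroll response in memory, where `next_page_offset` is the ID of the first point on the next page, or `null` at the end.

```typescript
interface ScrollPage<T> {
  points: T[];
  next_page_offset: number | string | null;
}

type FetchPage<T> = (
  offset: number | string | null,
  limit: number
) => Promise<ScrollPage<T>>;

// Visit every point by following next_page_offset until it is null.
async function scrollAll<T>(
  fetchPage: FetchPage<T>,
  limit: number,
  onPoint: (point: T) => void
): Promise<number> {
  let offset: number | string | null = null;
  let visited = 0;
  do {
    const page: ScrollPage<T> = await fetchPage(offset, limit);
    for (const point of page.points) {
      onPoint(point);
      visited += 1;
    }
    offset = page.next_page_offset;
  } while (offset !== null);
  return visited;
}

// In-memory stand-in for a scroll endpoint, for demonstration only.
function makeFakeFetcher(ids: number[]): FetchPage<{ id: number }> {
  return async (offset, limit) => {
    const start = offset === null ? 0 : ids.indexOf(Number(offset));
    const pageIds = ids.slice(start, start + limit);
    const next = start + limit < ids.length ? ids[start + limit] : null;
    return { points: pageIds.map((id) => ({ id })), next_page_offset: next };
  };
}
```

In a real export, `fetchPage` would wrap the client's scroll call with vectors switched off unless they are needed, and `onPoint` would write each point to the export target.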

Key Ideas

  • **Point = ID (int/UUID) + vector (float[]) + payload (JSON).** Payload is schema-free - different points can carry different fields
  • **Upsert** is idempotent: creates or fully replaces a point. Use `wait: true` when the write must be applied before the next step
  • **Batch upsert:** 64-256 points per request with about 4 parallel workers is a reasonable starting point for loading throughput; tune for your hardware and vector size
  • **`setPayload`** merges fields (like Object.assign), **`overwritePayload`** replaces everything. Use setPayload in 90% of cases
  • **`with_vector: false`** on retrieve/scroll saves ~6KB per point. **scroll()** is the correct tool for full collection traversal
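The batch sizing mentioned in the list above comes down to splitting the point array into fixed-size chunks before sending. This is a minimal sketch with a hypothetical helper `chunkPoints`; the parallel workers that would send the batches are not shown.

```typescript
// Split items into consecutive batches of at most `batchSize` elements.
function chunkPoints<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// 1000 items in batches of 256: three full batches plus one of 232.
const demoIds = Array.from({ length: 1000 }, (_, i) => i + 1);
const demoBatches = chunkPoints(demoIds, 256);
```

Each batch would then become one upsert request.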

What's next

Data is loaded. Time to search.

  • First Search — search API, distance metrics, score threshold - how to run queries
  • Payload Indexes and Filtering — How to filter search results by payload fields without degrading recall

Questions for reflection

  • Should full document text live in Qdrant payload? What are the tradeoffs of storing it elsewhere?
  • A collection holds 1M documents and every point needs a new 'language' field. How can it be added without regenerating embeddings?
  • How would multi-chunk indexing work: one large document split into many chunks, each stored as a separate Point?

Related lessons

  • qd-03-collections — Collections define dimensionality and distance metric - must exist before inserting points
  • qd-05-first-search — Points stored here become the corpus for nearest-neighbor search
  • qd-06-hnsw — HNSW index is built on top of the vectors stored as points
  • qd-08-payload-index — Payload fields defined here are the ones that get indexed for fast filtering
  • qd-07-distance-metrics — The vector dimensionality and type constrain which distance metrics are valid
  • la-01-vectors-intro