Qdrant - Vector Database

Named Vectors: Multiple Embeddings

A product has a photo and a description. A document has a title and body text. Why store them separately? Named vectors keep everything in one point and let you search by any modality - no data duplication.

  • **Visual search in e-commerce:** CLIP embeddings for images + text embeddings for descriptions, search by photo or text
  • **Multilingual documents:** one vector per language, search in any language without translation
  • **Media archives:** transcript (Whisper) + audio features + video frames - three vectors, one point

Prerequisites

  • Collections: Creation and Configuration

Why multiple vectors per point

**Named vectors** allow storing multiple vectors in one point. Each vector represents a different modality or model: text description, image, audio, short title, long body - all in a single data point.

**Example: e-commerce product.** A product has a name, description, and photo. Without named vectors, three separate collections or three separate points are needed. With named vectors - one point with three vectors.

| Scenario | Vectors | Description |
|---|---|---|
| Multimodal search | text + image | Search by text OR by image in one collection |
| Multilingual search | en + ru + de | Each language gets its own embedding model |
| Layered text | title + body | title: fast small model, body: accurate large model |
| Audio-video archive | transcript + audio + frames | Different models for different modalities |

You have news articles. You want to search both by headline (fast, precise) and by full text (semantic). How should you organize the data?

Setting up a collection with named vectors

**Creating a collection with named vectors** - instead of a single vectors object, pass a dictionary where keys are vector names.
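As a minimal sketch, the request body for such a collection can be built as a plain dictionary. The vector names `text` and `image`, the sizes 384 and 512, the collection name `products`, and the helper `named_vectors_config` are illustrative assumptions. The body shape follows the Qdrant REST endpoint `PUT /collections/{collection_name}` and should be checked against the Qdrant version in use.

```python
import json


def named_vectors_config(specs):
    """Build the "vectors" section of a create-collection request.

    specs maps a vector name to a (size, distance) pair. The helper name
    and the example values below are illustrative, not part of Qdrant.
    """
    return {
        "vectors": {
            name: {"size": size, "distance": distance}
            for name, (size, distance) in specs.items()
        }
    }


# Hypothetical schema: a 384-dim text embedding and a 512-dim image embedding.
create_body = named_vectors_config({
    "text": (384, "Cosine"),
    "image": (512, "Cosine"),
})

# This JSON would be sent as the body of PUT /collections/products
# ("products" is an assumed collection name).
print(json.dumps(create_body, indent=2))
```

Each key under `vectors` becomes a separate vector space with its own size and distance metric.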

**Important:** each named vector has its own independent HNSW index, so `hnsw_config` can be set separately for each - for example, a more precise index for text and a lighter one for an auxiliary image vector.
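A per-vector index configuration might look like the sketch below. The `m` and `ef_construct` numbers are placeholder assumptions rather than tuning advice, and the vector names and sizes are hypothetical; only the nesting of `hnsw_config` inside each vector's parameters is the point of the example.

```python
import json

# Assumed schema: "text" is the primary search vector, "image" is auxiliary.
vectors_config = {
    "text": {
        "size": 384,
        "distance": "Cosine",
        # Denser graph: higher recall, more memory and build time.
        "hnsw_config": {"m": 32, "ef_construct": 200},
    },
    "image": {
        "size": 512,
        "distance": "Cosine",
        # Lighter graph: cheaper to build and store.
        "hnsw_config": {"m": 8, "ef_construct": 64},
    },
}

# Body for PUT /collections/{collection_name}.
create_body = {"vectors": vectors_config}

for name, params in vectors_config.items():
    print(name, json.dumps(params["hnsw_config"]))
```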

**A new named vector cannot be added to an existing collection** without recreating it. Design the schema upfront. If a vector needs to be added later - create a new collection and migrate the data.
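The data-reshaping step of such a migration can be sketched in plain Python. The point layout (`id`, `vector`, `payload`), the `image_url` payload field, and the `embed_image` callback are all hypothetical. Reading points from the old collection and writing them to the new one are left out, since they depend on the client in use.

```python
def migrate_points(old_points, embed_image):
    """Rebuild points for a new collection that also has an "image" vector.

    old_points: iterable of dicts shaped like
        {"id": ..., "vector": {"text": [...]}, "payload": {...}}
    embed_image: caller-supplied function payload -> list of floats
        (hypothetical; a real one would call an image model).
    """
    new_points = []
    for point in old_points:
        vectors = dict(point["vector"])  # copy, so the old record is untouched
        vectors["image"] = embed_image(point["payload"])
        new_points.append(
            {"id": point["id"], "vector": vectors, "payload": point["payload"]}
        )
    return new_points


# Toy record standing in for a point read from the old collection.
old = [{"id": 1, "vector": {"text": [0.1, 0.2]}, "payload": {"image_url": "a.jpg"}}]

# Stand-in embedder that returns a fixed vector.
migrated = migrate_points(old, embed_image=lambda payload: [0.0, 1.0, 0.0])
print(sorted(migrated[0]["vector"]))
```

The existing `text` vectors are carried over unchanged, so only the new modality has to be embedded during the migration.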

You created a collection with named vector 'text'. Now you want to add an 'image' vector. What do you do?

Upserting with multiple vectors

**Upsert with named vectors** - pass an object instead of an array in the `vector` field. The object keys are vector names.
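A minimal sketch of the corresponding request body, assuming a collection with vectors named `text` and `image`. The 4-dimensional toy vectors, the payload, and the helper `product_point` are made up for illustration; real vectors must match the sizes configured for the collection.

```python
import json


def product_point(point_id, text_vec, image_vec, payload):
    """Build one point whose "vector" field is an object keyed by vector name."""
    return {
        "id": point_id,
        "vector": {"text": text_vec, "image": image_vec},
        "payload": payload,
    }


# Body for PUT /collections/{collection_name}/points.
upsert_body = {
    "points": [
        product_point(
            1,
            text_vec=[0.1, 0.2, 0.3, 0.4],   # toy text embedding
            image_vec=[0.9, 0.8, 0.7, 0.6],  # toy image embedding
            payload={"name": "Red sneakers"},
        )
    ]
}

print(json.dumps(upsert_body, indent=2))
```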

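**Partial update:** a single vector can be updated without touching the others. Useful when a product image changes but the text description hasn't. Note that a regular upsert overwrites the whole point, so omitting a vector there drops it. To change one named vector and keep the rest, use the update-vectors operation (`update_vectors` in the Python client, `PUT /collections/{collection_name}/points/vectors` in REST), which leaves unspecified vectors unchanged.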
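One way to change a single vector is the dedicated update-vectors endpoint, `PUT /collections/{collection_name}/points/vectors`, which according to the Qdrant documentation keeps unspecified vectors as they are. The sketch below only builds the request body; the helper name `update_vectors_body` and the toy values are assumptions.

```python
import json


def update_vectors_body(updates):
    """Build an update-vectors request body.

    updates maps a point id to {vector_name: new_vector}; only the vectors
    listed here are meant to be replaced.
    """
    return {
        "points": [
            {"id": point_id, "vector": vectors}
            for point_id, vectors in updates.items()
        ]
    }


# The product image changed, so only the "image" vector is re-sent.
body = update_vectors_body({1: {"image": [0.5, 0.5, 0.5, 0.5]}})
print(json.dumps(body))
```

Because only the changed modality is re-embedded and sent, the slower embedding model runs only for points whose image actually changed.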

You have 10,000 products. Text embedding takes 0.1s, image embedding takes 0.5s. How do you minimize indexing time?

Searching named vectors and cross-modal search

**Searching a specific named vector** - specify the `using` parameter. This selects which vector space to search in.
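A sketch of a query body that targets one vector space, following the shape of the Query API endpoint `POST /collections/{collection_name}/points/query`. The helper `query_body`, the toy query vector, and the limit are illustrative assumptions.

```python
import json


def query_body(query_vector, using, limit=10):
    """Build a query request that searches a single named vector."""
    return {
        "query": query_vector,   # must have the size configured for `using`
        "using": using,          # name of the vector space to search
        "limit": limit,
        "with_payload": True,    # return stored fields with each hit
    }


# Search the "image" space with a toy 4-dim query vector.
body = query_body([0.2, 0.1, 0.9, 0.7], using="image", limit=3)
print(json.dumps(body))
```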

**Cross-modal search** - search in one modality, retrieve results from another. For example: a user uploads a photo, we find products and return their text descriptions.

**CLIP for true cross-modal search:** CLIP models (ViT-B/32, ViT-L/14) project text and images into ONE shared space. This allows searching text queries against image vectors and vice versa. Text-only embedding models (OpenAI, BGE) produce vectors in a different space, so their outputs cannot be meaningfully compared with image embeddings.
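The idea can be illustrated with a toy brute-force search. All vectors below are invented 3-dimensional stand-ins, not real CLIP outputs, and the linear scan only mimics what the index does. The sketch assumes the query text was encoded by the same shared-space model that produced the stored `image` vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(points, query_vector, using, limit=2):
    """Rank points by similarity in the named vector space `using`."""
    scored = [(cosine(query_vector, p["vector"][using]), p) for p in points]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:limit]]


# Toy catalog: each point stores an "image" vector and a text description.
points = [
    {"id": 1, "vector": {"image": [0.9, 0.1, 0.0]},
     "payload": {"description": "Golden retriever puppy"}},
    {"id": 2, "vector": {"image": [0.1, 0.9, 0.0]},
     "payload": {"description": "Tabby cat"}},
    {"id": 3, "vector": {"image": [0.0, 0.1, 0.9]},
     "payload": {"description": "Mountain bike"}},
]

# Pretend this is the text "a dog" encoded into the shared space.
text_query = [0.8, 0.2, 0.1]

hits = search(points, text_query, using="image")
for hit in hits:
    print(hit["id"], hit["payload"]["description"])
```

The query comes from one modality (text), the comparison happens in the `image` space, and the result returned to the user is the stored description.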

A user uploads a photo of a dog and wants to find similar animals in the database. Which named vector should you search?

Key Ideas

  • **Named vectors** - multiple vectors with different dimensions and metrics in one point
  • **Schema is fixed at creation** - a vector cannot be added after creating the collection
  • **Upsert with named vectors** - pass `{name1: vec1, name2: vec2}` in the vector field
  • **Search** - specify `using: 'vector-name'` to select the search space
  • **Cross-modal search** - only possible with compatible spaces (CLIP)
  • Remember the hook? One point = the entire product - no duplication, searchable by any modality

What's next

Named vectors are set up. Now - precise result filtering with the Filter API.

  • Filter API — must/should/must_not - filter by category, price, date
  • Hybrid Search — Combine named dense + sparse via prefetch and RRF
  • Result Grouping — Deduplication by document when searching over chunks

Questions for reflection

  • Which modalities in your project would benefit from named vectors?
  • How do you choose which named vector to use as the default for your primary search scenario?
  • What are the trade-offs of storing an image vector in memory vs on disk for a large collection?

Related lessons

  • la-02-dot-product