Qdrant - Vector Database

Universal Query API

Three APIs instead of one means three times more code, tests, and potential inconsistencies. Universal Query unifies everything and adds capabilities that simply didn't exist before.

Hybrid search (dense + sparse) in one network round-trip instead of two
A/B testing different search strategies without changing client code
Gradual migration from simple search to a multi-stage RAG pipeline

Prerequisites

Universal Query: One Endpoint Instead of Three

Before Qdrant v1.10, three separate endpoints existed: `/points/search` (vector search), `/points/recommend` (recommendations), and `/points/discover` (discovery). Each accepted different parameters and returned slightly different response formats. In v1.10, the **Universal Query API** arrived - a single `POST /collections/{name}/points/query` that handles everything.
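
As a rough sketch of what calling the single endpoint looks like, the snippet below prepares a nearest-neighbor request with only the Python standard library. The base URL (a local instance on port 6333), the collection name `articles`, and the helper names `build_query_request` and `run_query` are assumptions made for illustration; they are not part of Qdrant.

```python
import json
import urllib.request

# Assumption: a local Qdrant instance listening on the default REST port.
QDRANT_URL = "http://localhost:6333"


def build_query_request(collection, body):
    """Prepare (but do not send) a POST to the Universal Query endpoint."""
    url = f"{QDRANT_URL}/collections/{collection}/points/query"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(collection, body):
    """Send the prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(build_query_request(collection, body)) as resp:
        return json.load(resp)


# Nearest-neighbor search: `query` is a plain list of floats.
# "articles" is a hypothetical collection name.
request = build_query_request("articles", {"query": [0.1, 0.2, 0.3, 0.4], "limit": 5})
print(request.get_method(), request.full_url)
```

Calling `run_query("articles", {...})` against a running instance would send the same body; only the shape of `query` changes between modes.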

Universal Query is not just syntactic sugar - it unlocks capabilities that were previously impossible: **prefetch** (multi-stage retrieval in one request) and **fusion** of multiple search strategies without extra round-trips.

The key parameter is `query`. Its type determines the operating mode:

- `float[]` → nearest-neighbor search (ANN)
- `string` (point UUID) → "more like this": Qdrant uses the vector already stored for that point as the query vector
- `{recommend: {positive: [...], negative: [...]}}` → recommendation from positive and negative examples
- `{discover: {target, context}}` → discovery: search near a target, constrained by context pairs
- `{fusion: "rrf"}` → merge results from prefetch stages
- `null` / omitted → scroll (returns points without ranking)

What happens when a string (UUID) is passed as the `query` in Universal Query?

Universal Query Modes: From Search to Discovery

Universal Query exposes four core modes through a single API. The table below shows which query type selects each mode and when to use it.

| Mode | Query type | When to use |
| --- | --- | --- |
| ANN Search | `float[]` | You have a query vector (from an embedder) |
| Recommend by ID | `string` (UUID) | "More like this" based on an existing item |
| Recommend +/- | `{recommend: {positive, negative}}` | Personalization with like/dislike history |
| Discovery | `{discover: {target, context}}` | Exploration: similar to A, not like B |
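
The four modes can be sketched as request bodies, one small builder per mode. The function names and the point IDs below are hypothetical; the bodies follow the `query` shapes described in this lesson, and each one would be sent as JSON to `POST /collections/{name}/points/query`.

```python
import json


def ann_query(vector, limit=10):
    """ANN search: `query` is the raw query vector."""
    return {"query": vector, "limit": limit}


def by_id_query(point_id, limit=10):
    """'More like this': `query` is the ID of a point already in the collection."""
    return {"query": point_id, "limit": limit}


def recommend_query(positive, negative, limit=10):
    """Recommendation from liked (positive) and disliked (negative) examples."""
    return {
        "query": {"recommend": {"positive": positive, "negative": negative}},
        "limit": limit,
    }


def discover_query(target, context_pairs, limit=10):
    """Discovery: search near `target`, constrained by (positive, negative) pairs."""
    context = [{"positive": p, "negative": n} for p, n in context_pairs]
    return {"query": {"discover": {"target": target, "context": context}}, "limit": limit}


# Hypothetical point IDs, for illustration only.
LIKED = "11111111-1111-1111-1111-111111111111"
DISLIKED = "22222222-2222-2222-2222-222222222222"

bodies = {
    "ann": ann_query([0.1, 0.2, 0.3, 0.4]),
    "by_id": by_id_query(LIKED),
    "recommend": recommend_query([LIKED], [DISLIKED]),
    "discover": discover_query(LIKED, [(LIKED, DISLIKED)]),
}
print(json.dumps(bodies["discover"], indent=2))
```

Because only the body differs, switching strategies (for example in an A/B test) comes down to choosing a different builder rather than a different endpoint.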

The old `/search`, `/recommend`, and `/discover` endpoints still work in v1.10+ for backward compatibility, but new code should use Universal Query - it receives active development and new features first.

Misconception: Universal Query is just a new name for `/points/search`.

In fact: Universal Query unifies search, recommend, and discover, and adds the new prefetch capability for multi-stage retrieval.

Prefetch lets you execute multiple searches in a first stage and combine their results (fusion) in a single round-trip. This was impossible with the previous individual endpoints.

What `query` type should you use to find products "similar to A but not like B" using existing points in the collection?

Prefetch: Multi-Stage Retrieval in One Round-Trip

**Prefetch** is the cornerstone feature of Universal Query. It lets you describe a multi-stage retrieval pipeline as a single request. Classic hybrid search (dense + sparse) previously required two separate requests and manual result merging on the client. With prefetch, this happens inside Qdrant in a single round-trip.
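
A hybrid request of this kind might look like the sketch below: two prefetch branches (one dense, one sparse) and a top-level `query` that fuses them. The named vectors `dense` and `sparse`, the helper name `hybrid_query`, and the sample values are assumptions; a real collection defines its own vector names.

```python
import json


def hybrid_query(dense_vector, sparse_indices, sparse_values, limit=10, candidates=20):
    """Build one request body that runs dense and sparse search, then fuses them."""
    return {
        "prefetch": [
            {
                # Dense branch: plain list of floats against the "dense" named vector.
                "query": dense_vector,
                "using": "dense",
                "limit": candidates,
            },
            {
                # Sparse branch: parallel lists of token indices and weights.
                "query": {"indices": sparse_indices, "values": sparse_values},
                "using": "sparse",
                "limit": candidates,
            },
        ],
        # The final stage merges both candidate lists with Reciprocal Rank Fusion.
        "query": {"fusion": "rrf"},
        "limit": limit,
    }


body = hybrid_query([0.01, 0.45, 0.67], [1, 42], [0.22, 0.8])
print(json.dumps(body, indent=2))
```

Each prefetch branch fetches more candidates (`candidates`) than the final `limit`, so the fusion step has enough overlap to work with.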

Prefetch supports **up to 2 levels of nesting**. Together with the final query, that enables three-stage pipelines: for example, a coarse search over quantized vectors, re-scoring over full float32 vectors, and a final re-ranking pass.
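
A nested request could be sketched as follows. The named vectors `coarse`, `full`, and `rerank`, the candidate counts, and both helper names are hypothetical; the point is only the shape: the innermost prefetch runs first and each outer level re-scores the candidates it receives.

```python
import json


def three_stage_query(coarse_vec, full_vec, rerank_vec, limit=10):
    """Build a request body with two nested prefetch levels plus the final query."""
    return {
        "prefetch": [
            {
                "prefetch": [
                    # Stage 1: cheap, wide candidate search.
                    {"query": coarse_vec, "using": "coarse", "limit": 1000},
                ],
                # Stage 2: re-score stage-1 candidates with the full-precision vector.
                "query": full_vec,
                "using": "full",
                "limit": 100,
            }
        ],
        # Stage 3: final re-ranking of the stage-2 candidates.
        "query": rerank_vec,
        "using": "rerank",
        "limit": limit,
    }


def prefetch_depth(node):
    """Return how many levels of `prefetch` are nested inside a request body."""
    children = node.get("prefetch", [])
    if isinstance(children, dict):  # a single prefetch may be given as one object
        children = [children]
    if not children:
        return 0
    return 1 + max(prefetch_depth(child) for child in children)


body = three_stage_query([1, 0, 1, 1], [0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1])
print(prefetch_depth(body))
print(json.dumps(body, indent=2))
```

The candidate limits shrink from stage to stage (1000, 100, 10 here), so the more expensive scoring runs on progressively fewer points.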

Performance: prefetch stages at the same level execute in parallel inside Qdrant, so two prefetch queries typically take about as long as the slower of the two rather than the sum of both. Combined with the saved network round-trip, this usually makes hybrid search with prefetch faster than sequential client-side requests.
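
For comparison, this is roughly the client-side merging that server-side fusion replaces: a minimal Reciprocal Rank Fusion over two ranked ID lists. The function name `rrf_fuse` and the sample IDs are hypothetical, and the constant `k=60` is the value commonly used in the RRF literature; it is an assumption here, and the constant Qdrant uses internally may differ.

```python
def rrf_fuse(rankings, k=60, limit=10):
    """Merge several ranked lists of point IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) to a point's score, where rank starts at 1.
    """
    scores = {}
    for ranking in rankings:
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] = scores.get(point_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID to keep the order stable.
    ordered = sorted(scores, key=lambda pid: (-scores[pid], str(pid)))
    return ordered[:limit]


dense_hits = ["a", "b", "c"]   # hypothetical IDs from a dense search
sparse_hits = ["b", "d", "a"]  # hypothetical IDs from a sparse search
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)
```

Points that appear in both lists ("a" and "b") rise to the top because they collect a contribution from each ranking, which is the behavior hybrid search relies on.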

What is `{fusion: 'rrf'}` in the `query` field of Universal Query?

Summary

Universal Query (`POST /collections/{name}/points/query`) is a single endpoint replacing search/recommend/discover
The type of the `query` field determines the mode: float[] = ANN, UUID = recommend, object = recommend/discover/fusion
Prefetch describes multi-stage retrieval in one request with parallel execution inside Qdrant
Fusion (`rrf`, `dbsf`) merges results from multiple prefetch stages inside Qdrant, with no merging code on the client
Nested prefetch (up to 2 levels) enables three-stage re-ranking pipelines

What's Next

Universal Query is the foundation for more advanced search patterns and exploration strategies.

FastEmbed: Embeddings Without a GPU — Generating dense and sparse vectors for prefetch queries
Discovery Search — Deep dive into the discovery mode of Universal Query
Hybrid Search — Prerequisites: understanding dense + sparse for prefetch

Questions for Reflection

When is prefetch better than two sequential client-side requests? Is it always the case?
Which fusion strategy (rrf or dbsf) would you choose for hybrid documentation search, and why?
How would you design an A/B test between ANN search and hybrid search using Universal Query?

Related Lessons

alg-10-binary-search