Qdrant - Vector Database
ACORN, Inline Storage, and Gridstore
Qdrant v1.16 and v1.17 bring more than new API features: they fundamentally change how data is stored and how the graph is traversed. Understanding these changes lets you squeeze maximum performance out of your hardware.
- Multi-tenant RAG: 10M documents, 100K users - ACORN keeps recall on target
- On-disk collection on NVMe: Inline Storage cuts disk reads by 16x
- Real-time indexing: Gridstore eliminates latency spikes from RocksDB compaction
Prerequisites
ACORN: Restoring Recall Under Restrictive Filters
The classic HNSW-with-filters problem: when a filter is very restrictive (e.g., `user_id = 'alice'`), most graph neighbors are filtered out. The search stalls because few or no edges lead to points that pass the filter, and recall can drop sharply, to around 60-70%.
**ACORN** (ANN Constraint-Optimized Retrieval Network), added in Qdrant v1.17, addresses this problem. The idea: when a direct neighbor in the graph is filtered out, ACORN checks **2-hop neighbors**, that is, the neighbors of neighbors. This significantly expands the set of reachable points while using the same HNSW graph structure.
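The 2-hop idea can be illustrated on a toy graph. This is a minimal sketch of a single expansion step, not Qdrant's implementation: the graph, the `filtered_candidates` helper, and the `allowed` set are all hypothetical and exist only for illustration.

```python
from typing import Callable

Graph = dict[int, list[int]]


def filtered_candidates(
    graph: Graph,
    node: int,
    matches: Callable[[int], bool],
    two_hop: bool,
) -> set[int]:
    """Return the points passing the filter that one expansion of `node` can reach."""
    found: set[int] = set()
    for neighbor in graph[node]:
        if matches(neighbor):
            found.add(neighbor)
        elif two_hop:
            # The direct neighbor is filtered out: look through it at its own neighbors.
            found.update(n for n in graph[neighbor] if n != node and matches(n))
    return found


# Toy graph: 0 is the entry point; only points 4 and 5 pass the filter.
graph: Graph = {
    0: [1, 2],
    1: [0, 3, 4],
    2: [0, 5],
    3: [1],
    4: [1],
    5: [2],
}
allowed = {4, 5}


def matches(point_id: int) -> bool:
    return point_id in allowed


plain = filtered_candidates(graph, 0, matches, two_hop=False)
acorn_like = filtered_candidates(graph, 0, matches, two_hop=True)

print(plain)               # set(): both direct neighbors are filtered out, the search stalls
print(sorted(acorn_like))  # [4, 5]: reached through the filtered-out neighbors
```

The extra work is visible in the sketch as well: every filtered-out neighbor triggers a scan of its own neighbor list, which is the source of the latency overhead under tight filters.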
| Scenario | Recall without ACORN | Recall with ACORN | Latency overhead |
|---|---|---|---|
| No filter | ~99% | ~99% | 0% |
| Soft filter (50% of points) | ~95% | ~98% | +10-20% |
| Tight filter (1% of points) | ~60% | ~95% | 2-10x |
| Very tight filter (0.1% of points) | ~30% | ~90% | 5-20x |
Under very restrictive filters ACORN is significantly slower: 5-20x compared to unfiltered search. This is a recall versus latency trade-off. For latency-critical cases, consider a **tenant-based** architecture with separate collections per user.
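Because ACORN trades latency for recall, it makes sense to request it only for the queries that need it. The sketch below builds a request body for the `POST /collections/{collection_name}/points/query` endpoint. The `filter` and `hnsw_ef` parts follow the long-standing query API; the `acorn` object and its `enable` field are written from memory and should be treated as an assumption to verify against the API reference for your Qdrant version. The helper name `build_acorn_query` is hypothetical.

```python
import json


def build_acorn_query(vector: list[float], user_id: str, limit: int = 10, hnsw_ef: int = 128) -> dict:
    """Build a filtered query body that asks for ACORN on this request only."""
    return {
        "query": vector,
        "filter": {
            "must": [{"key": "user_id", "match": {"value": user_id}}],
        },
        "params": {
            "hnsw_ef": hnsw_ef,
            # Assumed field names: check them against the API reference before use.
            "acorn": {"enable": True},
        },
        "limit": limit,
    }


body = build_acorn_query([0.2, 0.1, 0.9, 0.7], user_id="alice")
print(json.dumps(body, indent=2))
```

Keeping the switch at the request level means soft or unfiltered queries keep their baseline latency and only the tight per-user queries pay for the 2-hop traversal.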
How does ACORN restore recall under restrictive filters in HNSW?
Inline Storage: Vectors Inside Graph Nodes
Before Qdrant v1.16, searching an on-disk collection required **two operations** per HNSW node visit:
1. Reading the graph structure (neighbor links): one random I/O
2. Reading the point's vector from a separate file: another random I/O

For a deep HNSW search (ef=128), this means ~256 random disk reads. On an HDD that is catastrophic; on an NVMe SSD it is still slow.
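The ~256 figure is simple arithmetic, and turning it into time shows why the storage device matters. The sketch below treats `ef` as the number of visited nodes and assumes the reads are issued one after another. The per-read latencies are rough, hypothetical orders of magnitude, not measurements.

```python
# Assumed per-read latencies in microseconds (illustrative orders of magnitude only).
ASSUMED_READ_LATENCY_US = {"nvme": 100, "hdd": 10_000}


def random_reads(ef: int, reads_per_node: int = 2) -> int:
    """Random reads per search: one for the links and one for the vector of each node."""
    return ef * reads_per_node


def serial_io_time_ms(reads: int, device: str) -> float:
    """Time spent waiting on disk if every read is issued serially."""
    return reads * ASSUMED_READ_LATENCY_US[device] / 1000


reads = random_reads(ef=128)
print(reads)  # 256
for device in ASSUMED_READ_LATENCY_US:
    print(device, serial_io_time_ms(reads, device), "ms")
```

Under these assumptions a single query waits on the order of tens of milliseconds on NVMe and seconds on an HDD, before any distance computation is counted.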
**Inline Storage** (v1.16+): quantized vectors are stored directly inside HNSW graph nodes, in the same data block, so reading a node returns both the graph structure and the vectors in one operation. For memmap (on-disk) search, what used to take 32 random reads now takes 2 page reads, a 16x reduction.
**When Inline Storage gives maximum benefit:**
- on_disk collections (vectors on disk + graph on disk)
- Collections larger than server RAM
- NVMe SSD (not HDD: latency is too high there even with Inline Storage)
- Quantization enabled (int8 or binary)
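The quantization requirement becomes clearer with a back-of-the-envelope size estimate. The sketch below assumes a simplified layout in which a node's block holds, for each of its neighbors, a 4-byte link plus a copy of that neighbor's quantized vector, and that a level-0 node has 32 neighbors (twice the common `m = 16`). The layout, the constants, and the 1536-dimension example are assumptions for illustration, not Qdrant's actual on-disk format.

```python
import math

PAGE_SIZE = 4096      # bytes per disk page (assumed)
LINK_ID_BYTES = 4     # bytes per neighbor link (assumed)


def quantized_bytes(dim: int, bits_per_dim: int) -> int:
    """Size of one vector at the given precision."""
    return math.ceil(dim * bits_per_dim / 8)


def pages_per_node(dim: int, bits_per_dim: int, neighbors: int = 32) -> int:
    """Pages touched when reading one node with its neighbors' vectors inlined."""
    node_bytes = neighbors * (quantized_bytes(dim, bits_per_dim) + LINK_ID_BYTES)
    return math.ceil(node_bytes / PAGE_SIZE)


for name, bits in [("binary", 1), ("int8", 8), ("float32", 32)]:
    print(name, pages_per_node(dim=1536, bits_per_dim=bits))
# binary 2
# int8 13
# float32 49
```

Under this model only aggressively quantized vectors keep a node within a couple of pages; inlining full-precision vectors would make each node read larger than the scattered reads it replaces.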
In which scenario does Inline Storage provide the greatest performance improvement?
Gridstore: Replacing RocksDB for Payload and Sparse
Before v1.16, Qdrant used **RocksDB** for payload and sparse vector storage. RocksDB is built on an LSM-tree, a general-purpose design tuned for high write throughput, but it comes with write amplification (each byte is rewritten multiple times during compaction) and unpredictable latency spikes while compaction runs.
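Write amplification is easy to see in a toy model. The sketch below simulates a simplified tiered LSM-tree: every flush writes one unit of data, and whenever a level collects `fanout` runs they are merged and rewritten into the next level. It is a deliberately minimal model, not RocksDB's actual compaction strategy, and the function name is hypothetical.

```python
def lsm_write_amplification(flushes: int, fanout: int = 4) -> float:
    """Units written to disk per unit of user data in a toy tiered LSM-tree."""
    levels: list[list[int]] = [[]]  # levels[i] holds the sizes of the runs on level i
    written = 0
    for _ in range(flushes):
        levels[0].append(1)
        written += 1  # the flush writes the memtable once
        i = 0
        while len(levels[i]) >= fanout:
            merged = sum(levels[i])
            levels[i] = []
            if i + 1 == len(levels):
                levels.append([])
            levels[i + 1].append(merged)
            written += merged  # compaction rewrites every byte of the merged runs
            i += 1
    return written / flushes


print(lsm_write_amplification(3))   # 1.0: no compaction has happened yet
print(lsm_write_amplification(16))  # 3.0
print(lsm_write_amplification(64))  # 4.0
```

The amplification grows with the number of levels, and the merges arrive in bursts: the 64th flush in this model triggers three cascading compactions at once, which is the kind of event that shows up as a latency spike.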
**Gridstore** (v1.16+) is a custom storage engine from the Qdrant team, replacing RocksDB. Key benefits:
- **Less write amplification**: no LSM compaction
- **Stable latency**: no latency spikes from background tasks
- **Faster scans**: sequential access during scroll/filter is faster
- **Less CPU**: no background compaction threads
**When to use all three together** (ACORN + Inline Storage + Gridstore): This is the recommended configuration for **production large-scale deployment** on Qdrant v1.17+:
- Large collection (>10M points) that does not fit in RAM
- Multi-tenant architecture with per-user tenant filters
- Write-heavy: continuous upsert of new documents
- Recall requirements ≥95% even under restrictive filters
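A collection for this profile might be described roughly as follows. The two dictionaries are request bodies for `PUT /collections/{collection_name}` and `PUT /collections/{collection_name}/index`. The `on_disk`, `quantization_config`, and `is_tenant` fields are established parts of the API; the `inline_storage` flag under `hnsw_config` is written from memory and is an assumption to verify against the API reference. The vector size, the distance metric, and the `user_id` field name are placeholders. Nothing is set for Gridstore, on the assumption that the storage engine comes with the server version rather than being chosen per collection.

```python
import json

# Body for PUT /collections/{collection_name}
collection_body = {
    "vectors": {
        "size": 768,            # placeholder dimensionality
        "distance": "Cosine",   # placeholder metric
        "on_disk": True,        # original vectors stay on disk
    },
    "hnsw_config": {
        "on_disk": True,            # graph on disk as well
        "inline_storage": True,     # assumed flag name: verify before use
    },
    "quantization_config": {
        "scalar": {"type": "int8", "always_ram": False},
    },
}

# Body for PUT /collections/{collection_name}/index
tenant_index_body = {
    "field_name": "user_id",
    "field_schema": {"type": "keyword", "is_tenant": True},
}

print(json.dumps(collection_body, indent=2))
print(json.dumps(tenant_index_body, indent=2))
```

ACORN does not appear here because it is a search-time choice made per query rather than a property of the collection.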
What main problem with RocksDB does Gridstore solve?
Summary
- ACORN (v1.17+): 2-hop HNSW navigation restores recall from ~60% to ~95% under restrictive filters
- Inline Storage (v1.16+): quantized vector stored inside the graph node - disk reads during on-disk search drop by 16x
- Gridstore (v1.16+): RocksDB replacement without LSM compaction - stable latency, less write amplification
- Higher m in HNSW (32 instead of 16) improves ACORN - more 2-hop paths through the graph
- All three together - the recommended configuration for large-scale on-disk multi-tenant collections
What's Next
You have completed the full advanced Qdrant course. Here are directions for further study.
- Performance Tuning: practical parameters for tuning Qdrant under load
- Quantization: a deep understanding of quantization, the foundation for Inline Storage
- HNSW: HNSW algorithm details, the foundation for understanding ACORN
Questions for Reflection
- At what collection size would you enable on_disk for vectors? What memmap_threshold would you choose?
- How would you measure the actual performance gain from Inline Storage in your system? What metrics would you watch?
- If your service grew rapidly from 1M to 50M documents - in what order would you apply ACORN, Inline Storage, and Gridstore?