Qdrant - Vector Database
Replication and Fault Tolerance
A node goes down at 3 AM. Are your users seeing errors or not? The answer is not determined by how fast the on-call engineer responds - it's determined by how replication_factor was configured when the collection was created. In distributed systems, fault tolerance is an architectural decision made upfront.
- **Production readiness:** replication_factor: 2, write_consistency: 1 - the baseline. Survives simultaneous failure of 1 node without losing availability
- **E-commerce with transactions:** write_consistency: quorum guarantees that a cart item won't disappear after a page refresh
- **Multi-region HA:** replication_factor: 3 across 3 data centers - survives a full data center outage
Предварительные знания
Replication Factor: Shard Copies Across Multiple Nodes
**Replication** in Qdrant means copying each shard to multiple nodes. `replication_factor: 2` means every shard exists on 2 nodes. If one node goes down, the data remains available on the other.
**Scenarios with different replication_factor values:**
| replication_factor | Nodes in cluster | Survives failure of | Write overhead |
|---|---|---|---|
| 1 | 1+ | 0 nodes (no fault tolerance) | Minimal |
| 2 | 2+ | 1 of 2 nodes | ×2 on writes |
| 3 | 3+ | 1 of 3 nodes (or 2 with strong consistency) | ×3 on writes |
| 3 + write_consistency: 2 | 3+ | 1 node, writes confirmed by 2 replicas | ×3 writes, quorum |
Cluster: 3 nodes, shard_number: 3, replication_factor: 2. One node (node2) fails. How many shards now have only 1 replica?
Raft: Consensus for Cluster Metadata
**Raft** is the consensus algorithm Qdrant uses to synchronize cluster metadata: node list, collections, shard configuration. Raft guarantees all nodes share the same view of cluster state.
**Quorum and availability:** when quorum is lost (> N/2 nodes fail), Qdrant cannot change metadata (create collections, modify config). However **data reads and writes** continue working on the remaining nodes - depending on `write_consistency_factor`.
**Node count and Raft fault tolerance:** 2 nodes - quorum is 2, if 1 fails - no leader (no schema changes). 3 nodes - tolerates 1 node failure. 5 nodes - tolerates 2 node failures. An odd number of nodes is always better than even for Raft: 4 nodes tolerate only 1 failure (same as 3 nodes) - why pay for the 4th?
A 4-node cluster. 2 nodes fail simultaneously. What happens to Qdrant?
Consistency vs Availability: CAP in Qdrant
**The CAP theorem** applies to distributed Qdrant: you cannot simultaneously guarantee both Consistency and Availability during a network partition. Qdrant gives you control through `write_consistency_factor` and `read_consistency`.
| Scenario | write_consistency_factor | read consistency | Trade-off |
|---|---|---|---|
| Similar image cache | 1 | weak | Maximum speed, minor inconsistencies possible |
| Document search engine | 1 | medium | Good balance for most use cases |
| Read-after-write guarantee | 2 | strong | Higher latency, no stale reads |
| Financial / medical data | quorum | quorum | Maximum consistency, reduced availability |
**Data loss scenario:** replication_factor: 2, write_consistency_factor: 1. Write confirmed by replica-A. Replica-B has not yet received the data. Replica-A crashes. Data is lost - replica-B doesn't know about this write. If this is unacceptable - use write_consistency_factor: 2 (or 'quorum').
"replication_factor protects against data loss"
replication_factor protects against availability loss (node down - others respond). But with write_consistency_factor: 1 data loss is possible if a node fails BEFORE replication completes. For data durability, use write_consistency_factor: quorum.
Replication and durability are different guarantees. Replication: 'data is on multiple nodes'. Durability: 'data is reliably saved'. With write_consistency: 1, a write can be confirmed before propagating to all replicas - this is asynchronous replication.
A user adds a document and then immediately searches. With write_consistency: 1 and read consistency: 'weak' - the user might not find the just-added document. What is the least costly fix?
Key Takeaways
- **replication_factor** - how many copies of each shard. 2+ = fault tolerance. Set at collection creation time
- **Raft** manages cluster metadata via consensus. Odd node counts (3, 5) are better than even
- **write_consistency_factor:** 1 = high availability, quorum = data durability. Choose based on requirements
- **read consistency:** weak (fast) / strong (freshness guarantee). Can be set per request
- **Data loss** is possible with write_consistency: 1 + node fails before replication. Use quorum for critical data
What's Next
Replication is configured - now you need visibility into what's happening inside the cluster. Metrics and alerts.
- Monitoring and Prometheus — Track replica and shard health through metrics
- Distributed Mode — Sharding is the foundation on top of which replication works
- Performance Tuning — Replication affects write latency - account for this during optimization
Вопросы для размышления
- Explain the difference between replication_factor and write_consistency_factor. Can write_consistency_factor exceed replication_factor? What happens if you try?
- In CAP theorem terms, is Qdrant with write_consistency: quorum a CP or AP system? Justify your answer. How does it change with write_consistency: 1?
- How do you verify replication is working correctly? Describe a test: what to check, how to simulate a node failure, how to verify data integrity after recovery.