Databases
System Design: Uber Driver Location
When a passenger opens Uber in New York City at rush hour, the system must find the nearest available driver in under 100ms. With 5 million drivers globally sending GPS updates every 4 seconds, that is 1.25 million location writes per second. A naive SQL query scanning all 5M rows per request would require 14.5 billion row evaluations per second at peak load. Geospatial indexing with H3 hexagonal cells reduces this to 19 Redis key lookups.
- **Uber H3**: open-sourced in 2018. Used for driver matching (resolution 7, ~1.2km cells), surge pricing zones, and analytics. Adopted by DoorDash, Lyft, and mapping companies worldwide.
- **Lyft**: Redis GEORADIUS for driver search. Similar TTL-based liveness detection (30s key expiry). The matching system handles thousands of concurrent requests per second during peak hours.
- **DoorDash**: H3-based delivery zone management. Hexagonal cells enable uniform delivery radius estimation and prevent the corner-distance distortion that square grids produce.
Requirements and Scale
Uber's core problem: a rider requests a trip, and the system must find the nearest available driver within seconds. This requires storing and querying the real-time location of millions of drivers globally, updating every few seconds. The primary constraints are write throughput (location updates), read latency (nearest driver query), and geographic query correctness.
Uber's engineering team described their location tracking as one of their most challenging systems to scale. At peak times in dense cities like New York, a single Uber cell processes tens of thousands of location updates per second while simultaneously running matching queries for thousands of concurrent ride requests.
Uber receives 5M driver GPS updates every 4 seconds = 1.25M writes/sec. A naive SQL query scans all rows per match request. What makes this unscalable?
Geospatial Indexes
Geographic queries (find all points within radius R of point P) require spatial indexes that partition space, not just sort values. A B-Tree on latitude alone cannot answer 'all points within 1km' efficiently - the latitude range covers a strip around the globe, not a circle. Geospatial indexes (R-Tree, Quadtree, geohash) partition 2D space to enable efficient circular and rectangular region queries.
Redis GEO uses geohash encoded as a 52-bit integer stored as a sorted set score. GEORADIUS queries decompose the search radius into 9 geohash cells (center + 8 neighbors) and perform range queries on the sorted set. This is O(N + log(M)) where N is results and M is elements in the search area.
Why is a standard B-Tree index on (lat, lng) inefficient for finding the nearest drivers within 1km?
H3 Hexagonal Grid
H3 is Uber's open-source hexagonal hierarchical geospatial indexing system. It divides the globe into hexagonal cells at multiple resolutions. Hexagons are preferred over squares because all neighbors of a hexagon are equidistant from its center, eliminating the distance distortion that squares introduce at corners.
Uber open-sourced H3 in 2018. It is now used by Uber for driver matching, surge pricing zone computation, and analytics. DoorDash, Lyft, and many mapping companies have adopted H3 for geospatial operations. The hexagonal grid enables O(1) cell lookups instead of spatial range queries.
Why does H3 use hexagons instead of squares?
Location Storage and Expiry
Driver locations are stored in Redis with a TTL to automatically expire stale data. An offline or non-moving driver's key expires after 30-60 seconds, removing them from the available pool without explicit deletion. This TTL-based liveness detection is simpler and more reliable than explicit heartbeat management.
Lyft uses a similar architecture with Redis GEORADIUS commands. The key insight from both Uber and Lyft engineering is that TTL-based expiry is more reliable than explicit 'driver offline' events - network failures or app crashes never require cleanup code to run. The location simply expires.
Why set a 30-second TTL on Redis driver location keys?
Key Ideas
- **Scale**: 5M drivers x 1/4s update = 1.25M writes/sec. Full table scan per match query is infeasible. Geospatial indexing is essential.
- **B-Tree on lat/lng** fails for radius queries: a latitude range covers a global band, not a local circle. Spatial indexes (geohash, R-Tree, H3) partition 2D space correctly.
- **H3 hexagons**: equidistant neighbors (vs square corners 41% farther). Resolution 7 (~1.2km cells) for matching; resolution 9 (~170m) for ETAs.
- **Redis per-driver key with TTL**: SET driver:id:location ... EX 30. Offline drivers expire automatically without explicit deletion or heartbeat management.
- **Matching query**: rider cell -> H3 grid_disk(k=2) = 19 cells -> SMEMBERS per cell -> rank by haversine distance -> top N drivers.
Related Topics
Uber's location system applies several data patterns:
- Redis Data Structures — Redis GEO commands (GEOADD, GEORADIUS) and sorted sets with geohash scores are the primary storage layer for driver locations.
- Cassandra Data Modeling — Trip history and location event logs are stored in Cassandra (append-only, time-sorted). Location updates are not stored in Cassandra due to its write-then-read model.
- Sharding — Location data is sharded by geographic region (city or H3 super-cell). Drivers in New York are handled by different shards than drivers in London.
Вопросы для размышления
- At H3 resolution 7 (~1.2km cells), a rider in the center of a cell and a driver at the cell's edge are up to 600m apart. How does increasing k in grid_disk(k) affect the trade-off between match quality and query cost?
- Uber's surge pricing divides a city into H3 cells and prices each based on supply/demand ratio. How would you design the data pipeline to compute surge multipliers in real time as driver locations change?
- Redis TTL automatically removes offline drivers. What happens if a driver has poor connectivity and misses one 4-second update interval but returns after 25 seconds? How does the system handle this?