Computational Geometry
Computational Geometry in Interviews
A Tesla self-driving engineer L5 interview gives one hour and three problems: build the convex hull of LiDAR scans, find the closest car in a point stream, design point-in-polygon for current-city detection. Knowing sweep line, hull algorithms, and R-tree is not academic decoration but a daily tool in robotics, GIS, and production computational geometry.
- **Tesla Autopilot** builds a convex hull of detected obstacles 30 times per second for the ESC control loop
- **Uber H3** is a geospatial index for driver-passenger matching; 1M qps for point-in-region
- **Google Maps** uses R-tree atop 100M+ administrative polygons; a 'which city' query resolves in milliseconds
Sweep Line in Interviews
A Yandex Maps interview: 'Given N segments in the plane, find every intersection in O((N+K) log N), where K is the number of intersections.' A candidate without **sweep line** background will write a naive O(N^2). A candidate who knows it will describe Bentley-Ottmann in 10 minutes: a vertical line moves left to right, a status structure holds active segments ordered by y, and events are starts, ends, and crossings.
Sweep line is a **universal template** for geometric problems: 'process events in coordinate order, maintain a structure of active objects.' It applies to segment intersection, closest pair, Voronoi diagram construction (Fortune), and polygon triangulation.
**Interview answer template**: (1) Recognize the 'all pairs' / 'process left to right' pattern. (2) Name the algorithm (Bentley-Ottmann, Fortune). (3) Describe events and status structure. (4) State complexity with justification.
Problem: 10^5 segments, decide whether at least one intersection exists. Which solution fits an interview?
Convex Hull
A Tesla self-driving engineer interview asks: 'Given a LiDAR point cloud from one frame, build the convex hull to define the minimum bounding region for obstacle avoidance.' **Convex hull** is one of the most common questions in robotics, ML (Support Vector Machines), and computer graphics (collision detection).
Two classic O(N log N) algorithms: **Graham scan** (sort points by angle relative to the bottom point, stack pass with turn-direction checks) and **Andrew's monotone chain** (sort by x, two-direction pass). For output-sensitive variants there is **Chan's algorithm** at O(N log h), where h is the number of hull points.
**Cross product = turn orientation**: the sign of `cross(O, A, B)` says whether the polyline O-A-B turns left (>0), right (<0), or is collinear (=0). The foundation of virtually every 2D algorithm.
Problem: compute the convex hull of 10^6 points 100 times per second (real-time for self-driving). Which interview approach?
Closest Pair of Points
A Facebook Friend Suggestions interview: 'Given N user locations, find the closest pair for a geo notification.' Naively O(N^2). The classic answer: **divide and conquer at O(N log N)**, described by Shamos and Hoey in 1975. Recalling that algorithm is a strong senior+ signal.
The algorithm: (1) Sort points by x. (2) Split into halves. (3) Recursively find the minimum in each half = `d`. (4) Check a strip of width `2d` around the dividing line. Key property: any point in the strip has at most 7 closest-neighbor candidates. (5) Combine results.
**KD-tree alternative**: for the dynamic case (points added/removed) a KD-tree gives O(log N) nearest-neighbor queries, beating divide-and-conquer recomputation. Choosing the structure to match the use case matters in interviews.
A ride-sharing system needs nearest-driver lookups on each passenger request. N=10^5, 1000 queries/sec. Which choice?
Geometric Algorithm Design
The final senior+ interview question: 'Design a system that checks whether a point lies inside a city polygon. N=10^5 polygons, 10^6 queries per second.' This is not an algorithmic puzzle (knowing is half), it is a **design** question: data structure choice, balancing preprocessing vs query, workload, SLO.
Approach: (1) Clarify SLO - latency? throughput? accuracy? (2) Estimate data characteristics - simple polygons vs complex? average vertex count? (3) Pick the index - R-tree for bbox prefilter, then point-in-polygon (ray casting, O(V) per polygon). (4) Discuss sharding for parallelism.
**Interview trap**: do not jump to an algorithm; start with bounds. 'Average polygon complexity? Are queries uniform? Strict accuracy or 99.9% enough?'
Geometric interview questions are about memorizing algorithms.
Geometric interviews test pattern recognition (sweep, hull, NN), structure choice for the workload, and trade-off discussion - not writing algorithms from memory.
An algorithm is googlable in 5 minutes. Designing the right solution architecture requires experience, which is exactly what the design question checks.
A candidate immediately writes ray-casting point-in-polygon without clarifying questions. What signal does this send to the interviewer?
Key Ideas
- **Sweep line** is the universal geometric template: events plus a status structure cover 80% of pair/intersection problems
- **Convex hull** in O(N log N) via Graham scan or Andrew's monotone chain; cross product underlies every orientation check
- **Closest pair** via divide-and-conquer at O(N log N); the key property is the 7-candidate strip of width 2d
- **KD-tree / R-tree** for dynamic NN queries: O(log N) per query instead of O(N log N) recomputation
- **Interview design = ask SLO/workload questions before picking the algorithm** - the main senior signal
Related Topics
Back to the Tesla interview: knowing sweep, hull, and NN algorithms is foundational, but senior level is about engineering instinct for choosing structures by workload. This connects to:
- Geospatial indexes (R-tree, geohash, H3) — In real systems geometric algorithms are wrapped in an index; without it, O(N log N) is useless at N=10^6
- Distributed storage — Geo indexes shard by region to scale; a critical decision in Uber/Google Maps-style systems
Вопросы для размышления
- Sweep line at O(N log N) is elegant on paper but cache-unfriendly due to random access in the status structure. When does naive O(N^2) win on real data thanks to locality?
- A 45-minute interview rarely leaves time to actually code sweep-line intersections. Which matters more: correct code or explaining architecture and trade-offs?
- ML approaches (DeepHull, neural sorting networks) yield approximate convex hulls faster than classical methods on GPUs. Does this change the must-know list for interviews in 3-5 years?
Связанные уроки
- cgeom-06 — Segment intersection and convex hull are top geometry interview problems
- cgeom-11 — Triangulation often appears as the final question in senior interviews
- alg-22-backtracking — Some geometric problems are solved via backtracking with geometric pruning
- alg-01-big-o — Geometric algorithms often have non-obvious O(n log n) complexities