Computational Geometry
Segment Intersections
OpenStreetMap holds 800 million nodes and 70 million roads. Every routing query has to reconcile roads against rivers, zones, and administrative boundaries. Brute-force O(n^2) on 70 million segments is 5*10^15 operations - geological time. Sweep line cuts it to seconds.
- **GIS:** map layer overlay - finding where roads cross rivers, zone boundaries, or pipelines
- **Collision detection in games:** broad phase - sweep and prune for AABBs, identifies potential collisions among thousands of objects
- **CAD / PCB design:** checking for trace crossings on a printed circuit board - critical for manufacturing
Shamos, Hoey, Bentley, Ottmann - the sweep revolution
In 1976 Shamos and Hoey shipped the first sweep-line algorithm for detecting whether any intersection exists at all - O(n log n). Before that paper, O(n^2) felt like a law of nature. Three years later Bentley and Ottmann pushed the idea further, reporting ALL k intersections in O((n+k) log n). The 1979 paper became canon: every computational geometry textbook carries it, and the algorithm became the bedrock of CGAL - the industrial library powering Google Maps and AutoCAD.
Предварительные знания
The Sweep Line Paradigm
Picture a vertical ruler gliding left-to-right across the plane. At any instant it sees only the segments currently crossing it - segments pop in, segments drop out. That is the **sweep line**, one of the sharpest paradigms in computational geometry: a 2D problem collapses into a stream of 1D problems.
**Sweep Line** - a vertical line scanning the plane left-to-right. Two moving parts: the **Event Queue** (points where something changes) and the **Status Structure** (the ordered set of segments currently intersecting the line).
Three types of events in a sweep line algorithm for segments:
| Event type | When | Action |
|---|---|---|
| Left endpoint | Sweep line reaches the left end of a segment | Insert into status, check neighbors |
| Right endpoint | Sweep line reaches the right end | Remove from status, check new neighbors |
| Intersection | Two segments intersect | Swap order in status, check new neighbors |
The trick: **two segments can only intersect if at some moment they sit next to each other in the status**. Checking every pair is wasted work; checking neighbors on each status change catches every intersection. That single observation is what unlocks the speedup.
Why does the sweep line algorithm only check adjacent segments in the status structure?
The Bentley-Ottmann Algorithm
In 1979 Bentley and Ottmann turned the sweep-line idea into a concrete algorithm that reports **all** intersections among n segments. The brute-force approach grinds through O(n^2) pairs. Bentley-Ottmann runs in O((n + k) log n), where k counts the actual intersections. With n = 10,000 and k = 50: brute force burns 10^8 ops, B-O burns roughly 10^5 - a thousand-fold gap.
**Bentley-Ottmann** finds all k intersections among n segments in O((n + k) log n) time and O(n + k) space. The Event Queue is a heap; the Status Structure is a balanced BST. Each event takes O(log n) to process.
| Algorithm | Time | Space | When to use |
|---|---|---|---|
| Naive (all pairs) | O(n^2) | O(1) | n < 100 |
| Bentley-Ottmann | O((n+k) log n) | O(n+k) | General case |
| Balaban (1995) | O(n log n + k) | O(n+k) | Theoretically optimal |
| Shamos-Hoey | O(n log n) | O(n) | Only checking IF an intersection exists (yes/no) |
**Implementation challenges:** degenerate cases - three segments meeting at one point; vertical segments; intersections at endpoints. Production implementations (CGAL, Shapely) handle all of these.
For 1000 segments with 50 intersections, Bentley-Ottmann runs in:
Red-Blue Intersections
Split the segments into two camps - "red" and "blue". Only crossings **between** the camps count; crossings inside a single camp are noise. This is the **red-blue intersections** problem, and it shows up in map overlay, group-vs-group collision detection, and polygon operations across GIS layers.
**Red-Blue Intersections:** given two sets of segments R (red, n segments) and B (blue, m segments), find all intersections r ∩ b, where r ∈ R and b ∈ B. Intersections within R or within B are ignored.
Specialized algorithms exist for narrower red-blue cases. When one set is horizontal and the other vertical, the problem drops to O(n log n + k) via a **segment tree** or **interval tree**.
| Problem variant | Complexity | Method |
|---|---|---|
| General red-blue segments | O((n+m+k) log(n+m)) | Modified Bentley-Ottmann |
| Horizontal x vertical | O((n+m) log(n+m) + k) | Sweep line + segment tree |
| Red = segments, blue = points | O((n+m) log n) | Sweep + binary search |
| Rectangles x points | O((n+m) log n + k) | Sweep + interval tree |
In the red-blue intersections problem, why are intersections within the same color ignored?
Output-Sensitive Algorithms
Output size in computational geometry swings wildly. Among n segments there might be zero crossings, or there might be O(n^2) of them. Algorithms whose runtime scales with output size k are called **output-sensitive**. Burning O(n^2) work to report three intersections is malpractice.
An **output-sensitive algorithm** has complexity that depends not only on the input size n but also on the output size k. For segment intersection, the theoretical optimum is O(n log n + k), achieved by Balaban's algorithm (1995).
Worst case (k = O(n^2)), Bentley-Ottmann actually loses to brute force - the log factor sinks it. Balaban's algorithm escapes that trap with theoretically optimal O(n log n + k), but the implementation is brutal.
| Task | Best algorithm | Output-sensitive? |
|---|---|---|
| Does an intersection exist? (boolean) | O(n log n) Shamos-Hoey | No (k doesn't affect runtime) |
| Find all k intersections | O(n log n + k) Balaban | Yes |
| Count k (without listing) | O(n^(4/3) log n) | Partially |
| Convex hull (h vertices) | O(n log h) Chan | Yes |
Sweep line crushes 2D problems by replaying them as a stream of 1D events. Bentley-Ottmann is the workhorse implementation for segment intersection. The output-sensitivity principle keeps everyone honest: an algorithm shouldn't spend O(n^2) work on an O(1) answer.
The naive O(n^2) algorithm for checking all segment pairs is adequate for practical tasks
For thousands of segments (maps, CAD, collision detection) O(n^2) is unacceptable. Bentley-Ottmann at O((n+k) log n) is orders of magnitude faster when k is small. For a boolean check, Shamos-Hoey runs in O(n log n) with no dependence on k.
In real applications n can reach 10^5-10^6 (road networks, map polygons). O(n^2) = 10^10-10^12 - completely impractical. Sweep line reduces the problem to O(n log n + k), which in practice is often nearly linear.
For n = 10,000 segments with no intersections at all (k = 0), what is the optimal approach?
Key Takeaways
- **Sweep line** - a paradigm: a vertical line sweeps the plane, processing events (endpoints, intersections)
- **Bentley-Ottmann** O((n+k) log n): event queue + status BST. Only neighbors in the status are checked
- **Red-Blue Intersections:** find intersections only between two groups - modified sweep line
- **Output-sensitivity:** O(n log n + k) - an honest measure that does not penalize when there are few intersections
Related Topics
Sweep line is a universal paradigm that applies far beyond segment intersection:
- Points, Segments, Polygons — Fundamental primitives: cross product and orientation - the foundation for testing whether a pair of segments intersects
- Convex Hull — Graham Scan is also an ordered traversal. Output-sensitivity: Chan's O(n log h) vs Graham's O(n log n)
- Balanced Trees — The status structure is a BST. Sweep line efficiency depends on O(log n) BST operations
Вопросы для размышления
- The sweep line moves from left to right. What changes if all segments are vertical? How would the algorithm be adapted?
- Bentley-Ottmann is slower than the naive approach when k = O(n^2). Why is this rarely a problem in practice?
- How would sweep line be used to solve this problem: find all pairs of rectangles that overlap?
Связанные уроки
- cgeom-01 — Cross product and orientation - the basis for testing any segment pair
- cgeom-02 — Graham Scan is also a sweep idea; output-sensitivity: Chan O(n log h)
- alg-05 — Balanced BSTs are the status structure in Bentley-Ottmann
- cgeom-04 — Voronoi diagrams are built via Fortune's sweep line
- cgeom-05 — Delaunay triangulation also uses a sweep for construction
- alg-07 — Interval trees - an alternative to sweep for rectangle range queries
- la-04-lines-planes