3

[Tutorial] Solving Interval Problems with Geometry

 1 year ago
source link: http://codeforces.com/blog/entry/98629
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

[Tutorial] Solving Interval Problems with Geometry

By Monogon, history, 8 months ago,

There are a lot of programming problems where a collection of intervals comes up, either directly or indirectly. In this tutorial, I want to present a nice way to visualize a collection of intervals, and how you can solve problems with it.

An interval [l,r][l,r] is just a pair of integers ll and rr such that l≤rl≤r. Let's make a 2D plot where the horizontal axis is ll and the vertical axis is rr. Then our interval will be represented by the 2D point (l,r)(l,r). But an interval is more than just a pair of numbers: it describes the set of points xx such that l≤x≤rl≤x≤r.

How can we visually tell whether a number xx is covered by an interval? Well, we can represent it by a point (x,x)(x,x) in the plane. Then we can imagine that the interval's 2D point can see down and right, and the interval covers all the points it can see.

Here's an example where we see how the interval [2,4][2,4] covers the points 22, 33, and 44.

60b8b63a2b9682f2f347d66b3ab15b4523f21dcb.png

Great, we have a visualization for how an interval covers points. But we also want to talk about how intervals relate to each other. For example, intervals can be nested, or they can be disjoint, or something else.

Let's classify the 66 possible ways two intervals can relate to each other.

  1. An interval [l1,r1][l1,r1]covers an interval [l2,r2][l2,r2] if l1≤l2≤r2≤r1l1≤l2≤r2≤r1.

  2. An interval [l1,r1][l1,r1] is covered by an interval [l2,r2][l2,r2] if l2≤l1≤r1≤r2l2≤l1≤r1≤r2.

  3. An interval [l1,r1][l1,r1] is strictly left of an interval [l2,r2][l2,r2] if l1≤r1<l2≤r2l1≤r1<l2≤r2.

  4. An interval [l1,r1][l1,r1] is strictly right of an interval [l2,r2][l2,r2] if l2≤r2<l1≤r1l2≤r2<l1≤r1.

  5. An interval [l1,r1][l1,r1]intersects left of an interval [l2,r2][l2,r2] if l1≤l2≤r1≤r2l1≤l2≤r1≤r2.

  6. An interval [l1,r1][l1,r1]intersects right of an interval [l2,r2][l2,r2] if l2≤l1≤r2≤r1l2≤l1≤r2≤r1.

Note that there is some symmetry. For example, if one interval covers another interval, then the second interval is covered by the first interval. So, relations 11 and 22 are opposites, relations 33 and 44 are opposites, and relations 55 and 66 are opposites. Now, if we fix an interval (l1,r1)(l1,r1) we can make 66 different regions in the plane: one for each relation. I colored opposite relations the same color.

f0bc055b37bc44373c14a2fd0b8c09492d63c46a.png

Next, we're going to look at some example problems.

Problem 1. Nested Ranges Check

https://cses.fi/problemset/task/2168

Given nn ranges, your task is to determine for each range if it contains some other range and if some other range contains it.

There are two parts to this problem, so let's focus on one part for now. For each interval (l,r)(l,r), we want to check if it's covered by another interval. In other words, we want to check if its region 22 is nonempty.

For an interval to be in region 22, it must be left and up from it. Let's process all intervals in increasing order of ll, so that the "left of" condition just becomes "previously processed". So, we just need to ask for each interval, if there exists a previously processed interval above it.

If any previously processed interval is above (l,r)(l,r), then definitely the one with maximum rr will be above it. So we don't need to check all the previous intervals by brute force: just remember the maximum rr value seen so far and remember it.

There's one small issue we skipped over: when we sorted all intervals by ll, we didn't say what to do in the case of ties. In other words, if two intervals have the same ll value, should we process the lower or higher one first? Well, if there are two intervals with the same ll value, then the larger one is considered to cover the second. This means the interval with smaller rr should definitely be processed after the one with larger rr. So in the case of ties for ll value, we should process them in decreasing order of rr.

Now we've solved one of the two parts. For the other part, we want to check for each interval, if it covers another interval. We can actually use the exact same idea, except we will sort decreasing by ll, break ties in increasing order of rr, and remember the minimum rr value of all processed intervals.

The total complexity will be O(nlogn)O(nlog⁡n) from sorting.

Problem 2. Nested Ranges Count

https://cses.fi/problemset/task/2169

Given nn ranges, your task is to count for each range how many other ranges it contains and how many other ranges contain it.

This is very similar to the previous problem, but this time we want to count ranges instead of just checking. Just like last time, let's start with only focusing on counting for each interval, how many other intervals contain it. Also like last time, let's process all intervals in increasing order of ll, and break ties with decreasing order of rr.

Now when we process an interval (l,r)(l,r), we need to count how many previously processed intervals are above it. We should maintain some kind of data structure of the processed intervals, where we can insert a value and query how many are above a certain value.

If we do coordinate compression, we can make all the intervals have values between 11 and 2n2n without changing any of the relationships between them. Now we can store the processed rr values in a Fenwick tree, for example. To insert a new rr value, we update the Fenwick tree to increment position rr. To answer how many values are above rr, we query the range [r,2n][r,2n].

Again, the other half of this problem is the same except for the sorting order and we want to count the number of rr values below instead of above.

The total complexity will be O(nlogn)O(nlog⁡n), due to sorting and answering queries.

Problem 3. Largest Nested Interval Subset

Given nn intervals, find the size of the largest possible subset of intervals with the following property: for any pair of intervals, one contains the other.

First, we can describe the "nested subset" property in a nicer way: If the intervals are sorted decreasing by length r−lr−l, then the first interval contains the second interval, and the second interval contains the third interval, and so on. Geometrically, this means we're looking for a path of points that moves only down and right. Now we can see that this is basically the same as the longest decreasing subsequence problem.

We can solve this problem in a similar way as before: sort all intervals in increasing order of ll and break ties with decreasing order of rr. This time, we're going to use dynamic programming. For an interval [li,ri][li,ri] we'll compute dpidpi, which we define to be the largest nested subset of intervals such that [li,ri][li,ri] is the smallest interval. The final answer will be the maximum dpidpi value.

To compute dpidpi, we have the formula

dpi=1+maxj:ri≤rj(dpj),dpi=1+maxj:ri≤rj(dpj),

where we assume jj is only allowed to be a previously processed interval. If there exist no valid jj then we set dpi=1dpi=1. Just like the Nested Ranges Count problem, we can maintain a data structure that can support point updates (to insert a new dpidpi value) and range max queries (to find the maximum dpjdpj value corresponding to rj≥rirj≥ri). For example, a segment tree that supports range maximum queries will work here.

The total complexity will be O(nlogn)O(nlog⁡n) due to sorting and segment tree queries.

Problem 4. Interesting Sections

1609F - Interesting Sections

There is an array of non-negative integers a1,a2,…,ana1,a2,…,an. Count the number of segments [l,r][l,r] such that: 1. The minimum and maximum numbers are found on the segment of the array starting at l and ending at r. 2. the binary representation of the minimum and maximum numbers have the same number of bits equal to 1.

First, for each index, we want to know the largest interval on which it's the minimum element, and the largest interval on which it's the maximum element. We can precompute all these intervals in linear time with a monotonic stack idea, for example. Let's say the interval where aiai the minimum element is [l1(i),r1(i)][l1(i),r1(i)] and the interval where it's the maximum element is [l2(i),r2(i)][l2(i),r2(i)]. If there's a tie for the minimum (maximum) element in a range, we always pick the first occurrence to be the minimum (maximum).

Now, for each index ii, let's consider the set of intervals [l,r][l,r] on which aiai is the minimum element. We simply require it to contain index ii, and for it to be contained in [l1(i),r1(i)][l1(i),r1(i)]. In the plane, this is just a rectangle: l1(i)≤l≤il1(i)≤l≤i and i≤r≤r1(i).i≤r≤r1(i). Similarly, we have a rectangle for the maximum case.

Let's say that a rectangle is red if it came from the minimum case, and blue if it came from the maximum case. First, notice that no two rectangles of the same color intersect because a range has a unique minimum and maximum by our tie-breaking rule. Our task is to calculate how many lattice points are the intersection of a red and blue rectangle, and both of them have the same "group number", namely the popcount of their corresponding aiai values.

Let's just offset rectangles according to their group number so that rectangles of different group numbers definitely don't intersect, and rectangles of the same group are offset by the same amount.

The only thing we need to compute is the area of intersection of red rectangles and blue rectangles. We can find the area of intersection of these rectangles in O(nlogn)O(nlog⁡n) time with a sweepline and segment tree. The rectangle union problem is similar, which can be found on CSES: https://cses.fi/problemset/task/1741

For the rectangle intersection problem, we need our segment tree to maintain two things: 1. maximum value on a range, and 2. number of occurrences of the maximum element. It needs to support range increment/decrement updates and range queries. When the sweepline enters a rectangle, we increment its range, and when it exits a rectangle, we decrement its range. A value 22 in the segment tree means there are both red and blue rectangles covering it, so the number of occurrences of 22 describes how much intersection there is between red and blue at the sweepline's current position.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK