Monday, March 31, 2014

Segment Trees and USACO March Contest #1 (Gold)

The problem can be found here:
http://usaco.org/index.php?page=viewproblem2&cpid=418

I was unable to solve this problem in the actual contest, but after spending some time learning segment trees, I was able to understand the solution. This problem in USACO illustrates the power of segment trees. If you are not already familiar with segment trees, there are many tutorials online.

Here is a detailed explanation of the solution.

The idea here is to use a line-sweep algorithm. However, there is a slight difficulty. We notice that the points reachable by Bessie within a maximum manhattan distance of K is actually a square rotated by 45°. We must transform the coordinates such that the sides of this square are parallel to the new x and y axes. This is achieved by the following transformation:
(x,y) ->  (x+y,y-x)
For example, (3,5) gets transformed to (8,2). 

It is important to note that now, the square is rotated 45° and also dilated by a factor of sqrt(2). Clearly, the sides of the square are now 2K units long.

Instead of thinking of this square as the points reachable by Bessie traveling a maximum manhattan distance of K, it is advantageous to think of this rather as a square (centered on a grass patch) that contains the points where Bessie can be such that she can still reach the grass patch. It is easy to see that this square has the same dimensions.

For example, say the ith grass patch is located at (3,5) (after transformation) and K = 3. Bessie can be at any location (x,y) where 0<=x<=6 and 2<=y<=8. This is a square region with side length 2K. Each grass patch will produce its own square region.

Now we are ready to line sweep. Just like in all line-sweep algorithms, we need to pick "event" points. Each grass patch essentially has 5 interesting pieces of information. If grass patch i is located at (x_i,y_i), then here are the 5 pieces of information that we care about:
1.  x-coordinate of the left boundary of the square formed (this was described earlier). The value is (x_i-K).
2.  x-coordinate of the right boundary. The value is (x_i+K).
3.  y-coordinate of the bottom boundary. The value is (y_i-K).
4.  y-coordinate of the top boundary. The value is (y_i+K).
5. The value of the grass patch. The value is g_i.

Using this information, we can create two "events" for each grass patch. One for entering the square region and one for exiting. We assign an event as a combination of 4 integers (x,y1,y2,g), where x is the x-coordinate of the event, y1 the lower boundary of the region, y2 the upper boundary, and g is the value of the grass patch (positive for entering, negative for exiting). The events are sorted by non-decreasing x-coordinate.

Essentially, one should imagine the line as Bessie herself. For example, if Bessie is located on the left boundary, and she is also in between y1 and y2, then she can reach the grass. Naturally, we should add this to the amount of grass she can reach if she is in the segment [y1,y2]. Thus, the value of g for this event is positive. Conversely, if she is on the right boundary, she is exiting the region, so we should subtract this amount of grass from the total reachable grass if she is located in the segment [y1,y2].

Here is where the segment tree comes in. We need to use a segment tree to find the maximum grass Bessie can be able to reach given that she is located at a certain x coordinate (but she can be at any y coordinate). Everytime Bessie visits a new event (in the course of our line-sweep), we update this segment tree, checking if the new maximum for this event exceeds the previous best.

How do we update the tree? Think of each node of the segment tree representing the maximum amount of grass reachable given that Bessie is located in the segment represented by that node. The x-coordinate, as we already know, is fixed when we update. Thus, for each event, we update by incrementing the segment [y1,y2] by g. Then the maximum for this event is given by the query for the root node, which gives the maximum over all y. 

Since we are always querying the root node, this suggests a heuristic - lazy propagation. Again, there are many tutorials on lazy propagation, so I will not go into much detail.


Hopefully, I have helped you understand better the solution to this problem as well as introduce an interesting application of the segment tree data structure.