Graham's scan is an algorithm used to find the boundary on a set of points that form a convex hull. Invented in the early 70's by a person called Ron Graham, it is one of the earliest algorithms used in the field of computational geometry.

The algorithm has been proved to be the most efficient possible, with a time complexity of O(n log n).

The only calculation used other than an initial sort is a (simple) function to determine whether a given point lies on the left or the right of an existing line between two other points. This is covered in tes' writeup in computational geometry.

The ''scan'' bit of the titular algorithm comes from the manner in which the points are processed. With the points laid out on a standard Cartesian plane, you can visualise a vertical ''scan line'' moving left to right, with the algorithm dealing with points as they hit the scan line. Algorithms that ''scan'' a set of points in order to do something to them are common in computational geometry. You normally come across them doing things in close to O(n log n) time that a naive algorithm would do in O(nx).

In this case, the scan moves left to right and then back again to the left, for reasons explained shortly, just after the end of this paragraph in fact. The entire algorithm is as follows:

The input is a set of n points P (with the standard Cartesian coordinates), and the output is the set of points Q that form a convex polygon with all points in P either in or on its boundary1.

  1. Sort the points in P according to x-axis, smallest first. If two or more points have the same x-coordinate, sort those points by y-axis, smallest first.

    This is something that can be farmed out to a sorting algorithm such as quicksort. The end result is that P is a sorted list of points.

  2. Move the first and second points from P to Q.

  3. Do this:

    1. Move the next point from P to Q.

    2. If the angle of the arc formed by the last three points in Q turn anticlockwise then:

      while there are 3 or more points in Q, AND the last three points in Q turn anticlockwise:
      delete the point just before the one you added in step 3.1.

  4. While there are still points in P, repeat step 3.

What's being done here is quite simple. Step 2 simply starts us off. Steps 3 and 4 form the main loop, and tests if the previous point added is in or on the hull. For example, if we take these first three points:


Points a and b are added to Q (our final list) in step 2. Point c is then added, and step 3.2 finds that the line formed by the points (a, b, c) turns anticlockwise at point b. Point b is then deleted, like the anti-clockian scum it so plainly is. Step 4 makes us repeat step 3 again, and point d is added:


This time the last three points in Q, (a, c, d) form a clockwise turn, so we the loop in steps 3-4 marches onwards, until we are at the right-most point.

The points we are left with in Q form the upper half of the boundary. To complete the hull we compute the lower half of boundary. We start again with a sorted list P, just like in steps 2-42.

  1. Move the second-to-last point from P to Q.

    Technically speaking we should add the last point first as well, but step 3 above should have already added this the last time it was used.

  2. Do this:

    1. Move the last remaining point from P to Q.

    2. If the angle of the arc formed by the last three points in Q turn clockwise then:

      While there are 3 or more points in Q AND the last three points in Q turn clockwise:
      delete the point just before the one you added in step 6.1.

  3. While there are 2 or more points in P, go back to step 6.

    Instead of saying ''While there are still points in P, go back to step 5'', we make sure not to duplicate the first point in Q, which was the first point added to Q in step 2.

Steps 5-7 are almost identical to steps 2-4, with two differences:

  1. We add points from the end of the list instead of the beginning.
  2. We delete points that cause clockwise instead of anticlockwise turns.

This gives us the bottom half of the hull. The resulting list of points in Q begins with the left-most point, and contains every point on the boundary.

And that's it. What makes this algorithm so efficient?

The first step is the sort, which need only be done once2. Sorting is O(n log n). For each point in P, the loops at steps 3-4 and 6-7 are performed once each - a constant overhead.

The remaining question is how the time taken for the while loops in steps 3.2 and 6.2 to execute will change as the size of the input increases. Here, we can see that each point in P can be deleted from Q twice at most - also a constant overheard.

So the main steps of the algorithm are O(n), which we can then ignore when computing the order of the algorithm. Thus proving that the complexity of computing the convex hull of a set of points is equivalent to sorting.

1: The reason I'm assigning letters to these sets (or lists, to be pedantic) is because referring to them as ''the first/second set'', or ''the input/output set'' will eventually confuse me.

2: When actually coding this, you'd use the list sorted in step 1 in both steps, with a for loop or an iterator. The reason I'm deleting from P and then magically filling it again instead is so I don't have to explain for loops or iterators.

Computational Geometry: algorithms and applications
  2nd Ed, de Berg et al, 3-540-65620-0
Any mistakes are mine, not due to above sources.