"... it is sometimes said to run in time O(n log n), but this is false"

Actually, this is true: quicksort can be made to run in O(n log n) time even in the worst case.

As joeytsai points out, quicksort "chooses an element (called the pivot), and divides the remaining elements in two groups - elements smaller and greater than the pivot." Quicksort achieves its optimal O(n log n) running time only when the chosen pivot splits the input into two pieces of roughly equal size each time. If we cut the input nearly exactly in half at every level of recursion, the running time of quicksort is given by the recurrence:

T(n) = 2T(n/2) + O(n).

In other words, the cost of solving a problem of size n equals the cost of solving 2 subproblems of size n/2, plus work proportional to n (the partitioning step). The easiest way to solve this recurrence is with the Master Theorem: case 2 applies here, and the solution is therefore O(n lg n).
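For concreteness, here is the bookkeeping behind that application of the Master Theorem (a worked sketch; the case numbering follows the CLRS statement of the theorem):

    % Master Theorem applied to T(n) = 2T(n/2) + O(n):
    % a = 2 subproblems, each of size n/b with b = 2, and
    % f(n) = Theta(n) work per call (the partitioning step).
    \[
      a = 2, \quad b = 2, \quad f(n) = \Theta(n),
      \qquad n^{\log_b a} = n^{\log_2 2} = n.
    \]
    % Since f(n) = \Theta(n^{\log_b a}), case 2 applies:
    \[
      T(n) = \Theta\bigl(n^{\log_b a} \log n\bigr) = \Theta(n \log n).
    \]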

The problem is that the "naive" method of choosing a pivot doesn't produce evenly-sized subproblems every time: a pivot that happens to be the smallest or largest element yields subproblems of size 0 and n-1, and a run of such choices is exactly what drives quicksort to O(n^2). With a better method of choosing the pivot, however, we are guaranteed the optimal worst-case runtime. Note that we can split the array exactly in half if we always choose its median as the pivot. The median is also known as the n/2th order statistic, and any order statistic can be found in worst-case O(n) time with the SELECT (median-of-medians) algorithm. Doing this at each step of quicksort does not increase the asymptotic worst-case running time; the recurrence above simply becomes

T(n) = 2T(n/2) + O(n) + O(n).

... but O(n) + O(n) = O(n), since O(n) means "at most some positive constant times n", and the sum of two such terms is still at most a (larger) constant times n. Adding the median-selection step raises that constant, but the recurrence still solves to O(n log n), so the asymptotic running time of quicksort is unchanged.
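To make this concrete, here is a minimal Python sketch of the idea (the function names are mine, and it sorts into new lists rather than in place, trading efficiency for clarity):

    def select(items, k):
        """Return the k-th smallest element (0-indexed) of items in
        worst-case O(n) time, via the median-of-medians SELECT algorithm."""
        if len(items) <= 5:
            return sorted(items)[k]
        # The median of the group-of-5 medians is guaranteed to fall in
        # the middle 40% of the data, so it is always a "good" pivot.
        medians = [sorted(items[i:i + 5])[len(items[i:i + 5]) // 2]
                   for i in range(0, len(items), 5)]
        pivot = select(medians, len(medians) // 2)
        lows = [x for x in items if x < pivot]
        pivots = [x for x in items if x == pivot]
        highs = [x for x in items if x > pivot]
        if k < len(lows):
            return select(lows, k)
        elif k < len(lows) + len(pivots):
            return pivot
        else:
            return select(highs, k - len(lows) - len(pivots))

    def median_pivot_quicksort(items):
        """Quicksort that always pivots on the true median, giving a
        worst-case O(n log n) running time."""
        if len(items) <= 1:
            return items
        pivot = select(items, len(items) // 2)   # the n/2th order statistic
        lows = [x for x in items if x < pivot]
        pivots = [x for x in items if x == pivot]
        highs = [x for x in items if x > pivot]
        return median_pivot_quicksort(lows) + pivots + median_pivot_quicksort(highs)

For example, median_pivot_quicksort([3, 1, 4, 1, 5, 9, 2, 6]) returns [1, 1, 2, 3, 4, 5, 6, 9], and no input ordering can make the recursion go deeper than about lg n levels.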

On the other hand, the SELECT algorithm for finding the median is relatively expensive for an O(n) algorithm; i.e., the constant is high. For the best wall-clock running time, you'd want to determine empirically at what input size the O(n lg n) quicksort given above actually beats the naive or randomized-pivot quicksort (O(n^2) in the worst case, but with a much smaller constant), and only call the O(n lg n) version when the input is sufficiently large.
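The dispatch itself is trivial. A sketch, assuming the median_pivot_quicksort above and a hypothetical, empirically-tuned CUTOFF:

    import random

    def randomized_quicksort(items):
        """Plain quicksort with a random pivot: O(n log n) expected time,
        O(n^2) worst case, but a small constant factor per call."""
        if len(items) <= 1:
            return items
        pivot = random.choice(items)
        lows = [x for x in items if x < pivot]
        pivots = [x for x in items if x == pivot]
        highs = [x for x in items if x > pivot]
        return randomized_quicksort(lows) + pivots + randomized_quicksort(highs)

    CUTOFF = 10000   # hypothetical; measure on your own machine and data

    def hybrid_sort(items):
        """Pay SELECT's high constant only when the input is large enough
        for the worst-case O(n log n) guarantee to matter."""
        if len(items) >= CUTOFF:
            return median_pivot_quicksort(items)   # sketch above
        return randomized_quicksort(items)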


yerricde pointed out that heapsort's O(n lg n) performance may be better in this case. I tend to agree; there is a practical difference between the running times of, say, heapsort and quicksort, even if the "theoretical" running times are the same. In fact, NIST's directory of algorithms and data structures (http://www.nist.gov/dads/HTML/introspectiveSort.html) lists a type of sort called "introspection sort" and defines it as: "A variant of quicksort which switches to heapsort for pathological inputs, that is, when execution time is becoming quadratic."
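Here is a rough Python sketch of that strategy, following the NIST definition (the depth budget of about 2 lg n is the usual choice, as in Musser's original introsort paper; the naive middle-element pivot is just for illustration):

    import heapq
    import math

    def introsort(items):
        """Quicksort with a recursion-depth budget of about 2*lg(n); if
        the budget runs out (a sign the splits are going quadratic), fall
        back to heapsort, which is O(n log n) unconditionally."""
        def heapsort(xs):
            heapq.heapify(xs)                                   # O(n)
            return [heapq.heappop(xs) for _ in range(len(xs))]  # O(n log n)
        def sort(xs, depth):
            if len(xs) <= 1:
                return xs
            if depth == 0:
                return heapsort(xs)
            pivot = xs[len(xs) // 2]    # naive pivot; the depth guard
            lows = [x for x in xs if x < pivot]    # makes it safe anyway
            pivots = [x for x in xs if x == pivot]
            highs = [x for x in xs if x > pivot]
            return sort(lows, depth - 1) + pivots + sort(highs, depth - 1)
        return sort(list(items), 2 * int(math.log2(len(items))) if items else 0)

This guarded-quicksort strategy is essentially what several C++ standard library implementations of std::sort use today: quicksort's small constants on typical inputs, with heapsort's O(n lg n) bound as a safety net.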