These days, most computers sold are parallel machines of one kind or another. The industry has come close to the limit of increasing a computer's processing power by making its single CPU faster. Instead, manufacturers have turned to multiprocessor (SMP), multi-core (CMP), or hardware multi-threaded (SMT) machines to increase throughput. As such machines become more common, the average programmer has to deal with the synchronization issues involved in writing a multi-threaded application. Lock convoying is one of the many problems to which a lock-based synchronization system is prone, along with deadlock and priority inversion.

Suppose there is a set of threads doing similar work (worker threads in a web server, for example). Unless they spend a large fraction of their time accessing shared objects, one would expect their accesses to those objects to be spread more or less evenly over time: some threads will be near the beginning of their work cycle, some near the middle, and some near the end. This "natural randomization" ought to keep contention for any one lock low.

Now suppose some thread gets preempted while holding a lock that the other threads will need later in their work cycles. While it sleeps, the other threads "catch up" and either spin or get descheduled waiting on that lock. When the preempted holder finally wakes up and releases the lock, the waiters are all released at roughly the same time and proceed more or less in "lockstep" (pun intended). They will all arrive at the next lock at around the same time, resulting in high contention there as well. Eventually they may spread out again, until once more a thread gets preempted while holding a lock, at which point the convoy re-forms.
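
To make the scenario concrete, here is a minimal, self-contained C++ sketch of the worker loop described above. The thread count, the sleep durations, and the simulate_preemption flag are all illustrative assumptions; a deliberate sleep while holding the lock stands in for the holder being preempted, since a program cannot easily ask the scheduler to preempt it at exactly that point.

    #include <chrono>
    #include <cstdio>
    #include <mutex>
    #include <thread>
    #include <vector>

    using namespace std::chrono;

    std::mutex shared_lock;   // the one lock every worker needs once per cycle

    // One worker's cycle: mostly private work, then a brief critical section.
    void worker(int id, bool simulate_preemption) {
        for (int cycle = 0; cycle < 20; ++cycle) {
            // "Private" work touches no shared state, so it causes no contention.
            std::this_thread::sleep_for(milliseconds(5 + id % 3));

            auto arrived = steady_clock::now();
            {
                std::lock_guard<std::mutex> guard(shared_lock);
                auto waited = duration_cast<microseconds>(steady_clock::now() - arrived);
                std::printf("thread %d cycle %2d waited %6lld us\n",
                            id, cycle, static_cast<long long>(waited.count()));

                // Normally the critical section is very short...
                std::this_thread::sleep_for(microseconds(100));

                // ...but on one cycle this worker "loses the CPU" while still
                // holding the lock (a stand-in for preemption). Every other
                // worker catches up and queues behind the mutex.
                if (simulate_preemption && cycle == 10)
                    std::this_thread::sleep_for(milliseconds(50));
            }
            // Once the long hold ends, the waiters get through almost back to
            // back and tend to reach the next acquisition together: a convoy.
        }
    }

    int main() {
        std::vector<std::thread> workers;
        for (int id = 0; id < 4; ++id)
            workers.emplace_back(worker, id, /*simulate_preemption=*/id == 0);
        for (auto& t : workers)
            t.join();
        return 0;
    }

On a typical run one would expect the reported waits to stay near zero for the early cycles and then jump for every worker around the simulated preemption, with the pile-up persisting for some cycles afterwards, which is the convoy pattern described above.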