Comparing the relative performance of different computers is extremely complicated. Naturally, as with all complicated things, there is a drive to make it simple. Which, of course, you can do, if you don't mind living in the land between ignorance and deliberate deception.

Comparing computers via CPU clock speed (measured in megahertz or gigahertz, usually), MIPS, or other benchmark standards is a good example of the deceptively simple benchmark. Comparisons based on word length, more simply known as "bits" (8-bit, 16-bit, 32-bit, 64-bit, and 128-bit being the common variants), fall squarely into the same family.

The car analogy is frequently used to describe the benchmark problem. Most people seem to be asking, ultimately, how fast they can get the car to go if they have a straight stretch of desert highway and put the hammer down. The often-used benchmarks for computers are roughly equivalent to details like the number of cylinders the car's engine has, the size of the engine block, the horsepower of the engine, or the number of gears in the transmission. Obviously these will give you some idea about what's going on with the car, and may let you make a very rough comparison. In the hands of an expert, a variety of trivia can be the basis for a number of inferences, grounded in history and design principles, that will let you figure out the answers to specific, meaningful questions.

However, there are clearly a number of factors in cars that can make all of the standard "benchmarks" quite deceptive. Perhaps the best known is the cylinders-per-engine case, where an 8-cylinder engine might have been a Cadillac or a Jaguar, 12 and over might be a race car or a truck, and 4 might be a Toyota. Or a boat. However, startling advances in engine design have enabled modern carmakers to do things with a four-cylinder engine that you can barely do with six or eight, rendering engine cylinders useless as a measure of speed performance - for light cars. There is still the matter of torque and the gearbox; four-cylinder engines will not be making an appearance in the Mack trucks of the world any time soon. For the average car buyer, though, counting cylinders is only good for figuring out how much gasoline your car will waste. And overall, the emphasis on top speed has led the (ignorant, macho) observer to ignore other, more-apt-to-actually-make-a-difference-in-your-life factors, sometimes as obvious and critical as 0-60 times, braking distance, airbags, and, most important of all, the sound system.

Like all deceptive benchmarks, bit-counting has a basis in fact. Bits, in this context, refer to the number of on/off switches (called binary digits) which can be used to represent a number. A single bit can represent either zero or one. Two bits can represent a number between zero and three. Eight bits, 256 distinct numbers. Sixteen, 65,536. Thirty-two bits yields 4,294,967,296. And so on, in powers of two. Floating point numbers (numbers which are not necessarily integers and hence have a fractional component, such as 4.1 or 3.141592654) are represented in a more complex way (often according to a standard developed by the IEEE), but to gloss quickly: one bit indicates the sign, a block of bits holds the significant digits, and the remaining bits hold an exponent giving the position of the decimal point.
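
For the concrete-minded, here is a minimal C sketch of both points - how the number of representable values doubles with every added bit, and how an IEEE 754 single-precision float divides its 32 bits between sign, exponent, and significand:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <inttypes.h>

    int main(void) {
        /* Each extra bit doubles the number of distinct values a word can hold. */
        printf("8 bits:  %llu values\n", 1ULL << 8);   /* 256 */
        printf("16 bits: %llu values\n", 1ULL << 16);  /* 65,536 */
        printf("32 bits: %llu values\n", 1ULL << 32);  /* 4,294,967,296 */

        /* An IEEE 754 single-precision float packs a sign bit, an 8-bit
           exponent (the position of the point), and a 23-bit significand
           (the digits themselves) into the same 32 bits. */
        float pi = 3.141592654f;
        uint32_t raw;
        memcpy(&raw, &pi, sizeof raw);  /* reinterpret the float's bits */
        printf("sign=%" PRIu32 " exponent=%" PRIu32 " significand=0x%06" PRIX32 "\n",
               raw >> 31, (raw >> 23) & 0xFF, raw & 0x7FFFFF);
        return 0;
    }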

Overall, the design of CPUs and computers as a whole has evolved toward larger word lengths. As Alan Turing demonstrated succinctly, there is no size of number that a 2-, 4-, or 8-bit computer cannot handle - it's just a matter of extra steps. Lengthening the word - by adding more bits - allows a computer to handle numbers of larger scale or with greater accuracy in one step; that is, with one (or fewer) instructions. Advances in instruction set design have in fact taken this process further: since, the majority of the time, 64 or even 32 bits of precision is far in excess of what is required, multiple similar mathematical operations can be compacted into a single instruction and performed simultaneously. This is, in essence, what MMX, SIMD, 3DNow!, AltiVec, and SSE/SSE2 are - trading the excessive precision of the system against speed in performing multiple operations.
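
To make those "extra steps" concrete, here is a minimal C sketch (the function name is mine, not any real machine's instruction) of how a 32-bit machine can add two 64-bit numbers: split each number into halves, add the low halves, and ripple the carry into the high halves.

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    /* Add two 64-bit numbers using only 32-bit words - the extra steps
       a narrower machine must take to handle a larger number. */
    static void add64_with_32bit_words(uint32_t a_hi, uint32_t a_lo,
                                       uint32_t b_hi, uint32_t b_lo,
                                       uint32_t *sum_hi, uint32_t *sum_lo) {
        *sum_lo = a_lo + b_lo;
        uint32_t carry = (*sum_lo < a_lo) ? 1 : 0; /* wraparound means carry */
        *sum_hi = a_hi + b_hi + carry;
    }

    int main(void) {
        uint32_t hi, lo;
        /* 0x00000001FFFFFFFF + 1 = 0x0000000200000000 */
        add64_with_32bit_words(0x00000001u, 0xFFFFFFFFu,
                               0x00000000u, 0x00000001u, &hi, &lo);
        printf("%08" PRIX32 "%08" PRIX32 "\n", hi, lo); /* 0000000200000000 */
        return 0;
    }

A 64-bit machine does the same job in a single add instruction; the narrower machine simply takes more of them.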

As other writers here have pointed out, even referring to a particular "system" as "having a specific number of bits" can be deceptive and meaningless, since there is certainly no rule stating that the system must have a uniform word length. Depending on the instruction set and the hardware at the programmer's disposal, there may be a variety of functional units inside the computer which operate at different precisions (or at different precisions for different operations).

From a silicon-level design standpoint, or from the point of view of the motherboard or component designer, a system may be able to chomp three 32-bit integers in one operation, but it may not even have a 32-bit data path between various essential system components (such as video interfaces, hard drives, and various classes of memory). Data paths may be smaller (rare), the same size everywhere (less rare), or a wild combination of larger and smaller sizes for everything (most every system ever made). Using longer data words requires physically more complex circuits (this works the way you might think - 8 bits requires 8 wires, 32 bits requires 32, and so forth), and for some structures the complexity problem is geometric. Thus, larger word lengths can translate into more severe heat and power requirements very quickly. Conversely, smaller word length designs can leverage gains in simplicity and footprint into better and/or more consistent performance.
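
As a rough illustration of that geometric growth - idealized figures only, assuming a classic array multiplier, which needs on the order of w x w one-bit adder cells for a w-bit word:

    #include <stdio.h>

    int main(void) {
        /* A w-bit bus needs w wires (linear growth), but a classic array
           multiplier needs roughly w*w one-bit adder cells (geometric):
           doubling the word length quadruples that part of the chip. */
        for (int w = 8; w <= 64; w *= 2)
            printf("%2d-bit word: %2d bus wires, ~%4d multiplier cells\n",
                   w, w, w * w);
        return 0;
    }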

Different applications require different things from the CPU and the computer it lives in. Many operations are simply a zero/non-zero (boolean) comparison, in which all but one of your bits will be wasted. Many others will be comparisons on numbers never likely to be larger than 8 bits - a legacy of the days when that was an important limitation. It is only when performing a large number of complex mathematical operations that the word length becomes a serious factor - for instance, when watching a complicated Flash movie, or rendering a 3D scene. In these cases, large word length (together with other factors, such as the ability to do deep pipelining - a kind of simplified single-CPU parallel processing - and a good FPU - or several of them) will lend a tremendous speed increase. In fact, many of today's personal computers are being optimized specifically for these kinds of tasks.
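
A quick C sketch of that waste - one yes/no answer parked in a 32-bit word, versus 32 of them packed into the same space at the cost of extra shift-and-mask steps:

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        /* A boolean needs one bit but typically occupies a whole word:
           31 of these 32 bits are wasted on a single yes/no answer. */
        uint32_t is_ready = 1;

        /* Packing 32 flags into one word reclaims the waste - at the
           cost of the shift-and-mask steps needed to reach each flag. */
        uint32_t flags = 0;
        flags |= 1u << 5;              /* set flag number 5 */
        int flag5 = (flags >> 5) & 1u; /* read it back */
        printf("is_ready=%u, flag 5=%d\n", (unsigned)is_ready, flag5);
        return 0;
    }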

However, there are a variety of factors which can influence performance far more dramatically than word length, the most fundamental being overall system design. There are many 8-bit systems that will blow the doors off a 128-bit system... depending on the application, as always. There are also 64-bit workstations which are not nearly as fast as modern 32-bit workstations - even on an equivalent application. The quality of compilers, operating systems, and drivers is usually the biggest culprit after this - often wasting 50-90% of your computer's resources with traditional commercial-grade, pass-the-buck programming. Beyond this, performance bottlenecks frequently occur on the motherboard, not in the CPU - in communications with memory, the video adapter/hardware 3D accelerator, or the hard drive - as well as in more subtle problems, such as a shortage of memory: main memory, 3D accelerator memory, or cache memory of various kinds.

Overall, then, the number of bits a computer "has" is one relatively small factor in a very, very complex equation. And be very careful when you talk about a computer's "speed." Yes, different computers may be "fast" or "slow" - but context is everything. Some computers will be as "slow" with 500 concurrent users as they are with one - and others will appear blindingly fast but choke if asked to deal with the slightest resource contention. Some are really, really good for 3D games, and others really, really good for 3D modelling, for instance.

Perhaps the best metaphor is one many computer users are familiar with from adjusting their display settings. Yeah, there's 8-bit (256 colors), and 16-bit (Thousands of colors), and 24- or 32-bit (Millions of colors). Occasionally, some graphics cards offer even more... but most do not. And you know the reason why, don't you? Because after 16 bits, and certainly after 24, it's all just diminishing returns. Your eyes just can't tell the difference.