Compression is the art of representing data using fewer bytes than originally needed.
This can be achieved by reproducing only the information the data actually contains, in the sense used by information theory. For example, the string "aaaaa" doesn't really contain more information than "aaa". Representations might be "5a" and "3a". As you can see, the compressed lengths are equal, reflecting the fact that neither string contains much more information than the other.
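
The "5a" and "3a" representations above are a form of run-length encoding. A minimal sketch of that idea follows; the function name rle_encode is made up for illustration, and the scheme is only unambiguous if the input contains no digits.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Run-length encode text as count-then-symbol pairs (hypothetical helper)."""
    # groupby yields one group per run of identical consecutive characters.
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))


print(rle_encode("aaaaa"))  # 5a
print(rle_encode("aaa"))    # 3a
```

Both outputs are two characters long, which matches the point made above: the longer run costs no extra symbols in this particular encoding.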

Compression is the act of expressing a message using as little of the medium (i.e. as few symbols of an alphabet, usually bits) as possible. This is done by removing redundancy; the resulting message says as little as possible while retaining all of the information of the original. Information theory uses entropy to describe how much information a message contains, and thus how large it is when compressed. In a sense, the opposite of compression is error-correcting coding, which deliberately adds redundancy.
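
As a rough illustration of entropy, the sketch below estimates bits per symbol from the symbol frequencies of a single message. This is a simplifying assumption: it treats every symbol as independent, so it is only a crude estimate of how far a message can be compressed. The function name is made up for this example.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(message: str) -> float:
    """Estimate Shannon entropy (bits per symbol) from symbol frequencies."""
    counts = Counter(message)
    total = len(message)
    # Sum p * log2(1/p) over every distinct symbol in the message.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


print(entropy_bits_per_symbol("aaaaa"))  # 0.0
print(entropy_bits_per_symbol("abab"))   # 1.0
```

A message made of one repeated symbol scores zero bits per symbol under this estimate, while a message that uses two symbols equally often scores one bit per symbol.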

Compression also refers (from a material standpoint) to loading that will tend to push the particles of the material together. This is in contrast to tension, loading that will pull particles of the material apart.
Some examples of this would be the force applied by your feet on the floor, pushing on a surface, or load on a column. Concrete tends to be very strong in compression.

This write-up explains compression in the context of mastering audio (mainly for music, but this sort of compression is also used in films). It's not to be confused with regular data compression (such as you might do with gzip or bzip2) or even the similar digital compression of waveforms stored on a computer (such as you might do with Ogg Vorbis or FLAC).

What compressors do

When something is too loud, you're likely to turn the volume down to compensate. A compressor is essentially a simple tool that automates this process, with a much faster reaction time than a human could have. Once a signal reaches a certain volume, the compressor attenuates it. When it's back below that volume, the compressor leaves it alone.

Threshold and ratio

The two main settings in a compressor are the threshold and the ratio. The threshold is the volume at which the compressor kicks in. When the signal is below that threshold, it is left unaffected, but when the signal is above the threshold, the compressor makes it quieter.

The ratio is how much quieter the compressor makes the signal. Specifically, the ratio is how many decibels above the threshold the signal needs to be in order for the output to be 1dB louder. For example, with a 3:1 ratio, a signal 3dB above the threshold would only end up 1dB louder after going through the compressor, and a signal 6dB above the threshold would only be 2dB louder.
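
The arithmetic above can be sketched as a small function working on levels in decibels. The function name and the -20dB threshold in the usage lines are arbitrary choices for illustration, not values from any particular compressor.

```python
def compressed_level_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Return the output level in dB for a hard-knee compressor (illustrative)."""
    if input_db <= threshold_db:
        # Below the threshold the signal is left alone.
        return input_db
    # Above the threshold, every `ratio` dB of input yields 1 dB of output.
    return threshold_db + (input_db - threshold_db) / ratio


# With a -20dB threshold and a 3:1 ratio:
print(compressed_level_db(-17.0, -20.0, 3.0))  # -19.0 (3dB over becomes 1dB over)
print(compressed_level_db(-14.0, -20.0, 3.0))  # -18.0 (6dB over becomes 2dB over)
print(compressed_level_db(-30.0, -20.0, 3.0))  # -30.0 (below threshold, untouched)
```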

Any ratio above approximately 10:1 is generally considered to be limiting, or a brick wall approach: the signal can barely get above the threshold, no matter how loud the input gets.

Soft-knee versus hard-knee

Traditional compressors leave the signal completely unaffected until it reaches the threshold, then apply the exact ratio to the sound once the threshold has been reached. This tends to sound unnatural as the compressor has gone straight from one extreme (doing nothing) to the other (attenuating at the exact ratio) abruptly. While this can sound interesting in its own right, it is hardly subtle, hence the phrase hard-knee. The soft-knee approach was worked out as a smoother alternative: the compressor slightly attenuates the signal before it reaches the threshold, slowly building up until the threshold is reached and the full ratio is applied.
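
One common way to model a soft knee is to blend quadratically between the uncompressed and compressed lines over a "knee width" centred on the threshold. The sketch below assumes that model; real units may shape their knees quite differently, and the function and parameter names are made up for illustration.

```python
def soft_knee_level_db(input_db: float, threshold_db: float,
                       ratio: float, knee_db: float) -> float:
    """Output level in dB with a quadratic soft knee (one common model)."""
    overshoot = input_db - threshold_db
    half_knee = knee_db / 2.0
    if knee_db <= 0.0:
        # A zero-width knee is just the hard-knee behaviour.
        return input_db if overshoot <= 0.0 else threshold_db + overshoot / ratio
    if overshoot < -half_knee:
        # Well below the threshold: unaffected.
        return input_db
    if overshoot > half_knee:
        # Well above the threshold: the full ratio applies.
        return threshold_db + overshoot / ratio
    # Inside the knee: attenuation builds up gradually.
    return input_db + (1.0 / ratio - 1.0) * (overshoot + half_knee) ** 2 / (2.0 * knee_db)


# A signal exactly at a -20dB threshold, 4:1 ratio:
print(soft_knee_level_db(-20.0, -20.0, 4.0, 0.0))  # -20.0 (hard knee: untouched)
print(soft_knee_level_db(-20.0, -20.0, 4.0, 8.0))  # -20.75 (soft knee: already eased down)
```

With the soft knee, the signal is already being turned down slightly at the threshold itself, and the curve meets the hard-knee line at the upper edge of the knee.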

Attack and release

Compressors can be fast enough to ensure that even most sounds which rise in volume very quickly can be caught in time. However, there are some occasions when it's desirable to let a few quick bursts of sound sneak by unaffected. The attack setting is just for this purpose: it makes the compressor slow to respond to the waveform exceeding the threshold. As you might expect, the release setting is used to make the compressor slow at letting the signal pass through unaffected again. A slow attack setting can be useful to emphasise the percussive nature of drums, for instance, since the initial hit gets through before the compressor clamps down, whereas a slow release setting is invaluable to make sure the compressor won't interfere with individual cycles of low frequency waveforms such as bass guitars.
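
A simple way to picture attack and release is as a smoothing filter on the amount of gain reduction, with one time constant while the reduction is increasing and another while it is falling. The sketch below assumes that one-pole model with positive attack and release times; the names are invented for this example and no particular hardware is being described.

```python
import math


def smooth_gain_reduction(target_db, sample_rate, attack_s, release_s):
    """Smooth a per-sample gain-reduction target (in dB, positive = quieter)."""
    attack_coeff = math.exp(-1.0 / (attack_s * sample_rate))
    release_coeff = math.exp(-1.0 / (release_s * sample_rate))
    smoothed = []
    current = 0.0
    for target in target_db:
        # Reduction rising: use the attack time. Falling: use the release time.
        coeff = attack_coeff if target > current else release_coeff
        current = coeff * current + (1.0 - coeff) * target
        smoothed.append(current)
    return smoothed


# 50 samples asking for 6dB of reduction, then 50 samples asking for none,
# at a (deliberately low) 1000Hz sample rate, 10ms attack, 50ms release.
curve = smooth_gain_reduction([6.0] * 50 + [0.0] * 50, 1000, 0.010, 0.050)
print(curve[0] < 1.0)        # True: the first sample is barely attenuated
print(curve[99] > 0.0)       # True: still recovering 50 samples after the burst
```

The first few samples of a loud burst pass almost untouched, which is the "sneaking by" effect described above, and the reduction fades out gradually rather than snapping back.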

Sidechains

There are rare occasions when the signal being monitored to see when the threshold is reached and the signal actually being attenuated by the compressor should be two different waveforms. Modern compressors sometimes have a separate input for the sidechain (the part which monitors the waveform to see if the threshold has been reached) for this reason. DJs can send their voices to the sidechain and their records into the regular input, which makes the music quieter when they talk over the top of it. A much more cunning trick is to have a very small delay between the two signals, so that a compressor can react to a peak before it reaches its main input. You could also use it to attenuate backing vocals when the lead vocal is sufficiently loud. If you use a compressor which has a separate input for the sidechain, try experimenting with it.
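
The DJ example can be sketched as follows: the gain reduction is worked out from the sidechain signal (the voice) but applied to the main signal (the music). This is a deliberately bare illustration that reacts to each sample instantly, with no attack, release or level averaging; the function name and the sample values are made up.

```python
import math


def sidechain_compress(main, sidechain, threshold_db, ratio):
    """Attenuate `main` according to the level of `sidechain` (illustrative)."""
    out = []
    for main_sample, key_sample in zip(main, sidechain):
        level = abs(key_sample)
        key_db = 20.0 * math.log10(level) if level > 0.0 else float("-inf")
        overshoot = key_db - threshold_db
        # Gain reduction depends only on the sidechain, never on the main input.
        reduction_db = overshoot - overshoot / ratio if overshoot > 0.0 else 0.0
        out.append(main_sample * 10.0 ** (-reduction_db / 20.0))
    return out


music = [0.5, 0.5]
voice = [0.0, 1.0]  # silent, then speaking at full level
ducked = sidechain_compress(music, voice, threshold_db=-20.0, ratio=2.0)
print(ducked[0])             # 0.5: music untouched while the voice is silent
print(ducked[1] < ducked[0])  # True: music is turned down under the voice
```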

RMS versus peak detectors

Most compressors have RMS (root mean square, not Richard M. Stallman) level detection circuitry to work out the volume of the signal. This allows them to let loud but short peaks through, which is closest to the way the human ear perceives sustained sounds as louder than quick bursts of noise. There are times, however, when you may want to control even those quick peaks, which is why some compressors have peak level detectors. These respond to any signal peaks, even very short ones. Peak detection tends to work best with percussive sounds, whereas RMS level detection is usually the best one to go for with smoother sounds.
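
The difference between the two detectors is easy to see on a window of samples. The sketch below compares a single short click with a quieter sustained signal; the helper names and sample values are made up for illustration, and a real detector would run continuously rather than on one fixed window.

```python
import math


def peak_level(samples):
    """Largest absolute sample value in the window."""
    return max(abs(s) for s in samples)


def rms_level(samples):
    """Root mean square of the samples in the window."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


click = [1.0] + [0.0] * 99   # one full-scale sample, then silence
sustained = [0.5] * 100      # a steady signal at half scale

print(peak_level(click) > peak_level(sustained))  # True: peak detector reacts to the click
print(rms_level(click) < rms_level(sustained))    # True: RMS detector favours the sustained sound
```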

Full-band versus multi-band

Most compressors are full-band compressors: they affect the whole waveform at once. The main problem with this, at least when compressing complete mixes, is that a loud note from a bass instrument turns the whole mix down, so instruments in the midrange or treble get dragged down along with it.

The solution to this is multi-band compressors, which first split the signal up into several components - the bass, the midrange and the treble, for instance - and process each of these individually before putting them all back together again. In effect, they are several compressors rather than just one, independently working on different frequencies.
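
The split-process-recombine structure can be sketched with just two bands. The crossover here is a crude one-pole low-pass filter with the high band taken as whatever is left over, chosen only because the two bands then add back up to the original; real multi-band compressors use proper crossover filters. The per-band processors are passed in as plain functions standing in for separate compressors, and all names are hypothetical.

```python
def split_two_bands(samples, smoothing):
    """Split samples into (low, high) bands using a one-pole low-pass filter."""
    low, high = [], []
    state = 0.0
    for s in samples:
        state += smoothing * (s - state)  # low-pass: follows slow changes
        low.append(state)
        high.append(s - state)            # remainder: the fast changes
    return low, high


def process_multiband(samples, smoothing, process_low, process_high):
    """Process each band independently, then sum the bands back together."""
    low, high = split_two_bands(samples, smoothing)
    return [l + h for l, h in zip(process_low(low), process_high(high))]


def halve(band):
    """Stand-in for a compressor turning a band down by half."""
    return [0.5 * s for s in band]


def leave_alone(band):
    return band


# Turn down only the low band; the high band passes through untouched.
mixed = process_multiband([1.0] * 200, 0.1, halve, leave_alone)
print(round(mixed[-1], 3))  # 0.5
```

A steady input ends up almost entirely in the low band, so only the low-band processor affects it, which is the independence between bands described above.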

Full-band compressors are much simpler and easier to use - you don't have to worry about instruments halfway between two bands getting mangled by two different compressors - but sometimes a multi-band compressor can come in useful. Incidentally, the same technique of splitting up a signal into different parts is used by vocoders, but they do it for completely different purposes.

Imperfect analogue circuitry

Nothing in real life works like it's supposed to on paper, and compressors are no exception. There are several different electronic components that can be used to construct compressors, and each has its own little quirks. This can make real compressors sound very different to one another, even if they appear to have similar specifications. However, as can often be the case with musical instruments, a little imperfection can sometimes be more pleasing to the ear. If you're thinking of buying a vintage compressor, it can be worth listening to several different models to find out which one appeals the most to you.

References:
http://www.soundonsound.com/sos/1997_articles/apr97/compressors.html

Com*pres"sion, n. [L. compressio: cf. F. compression.]

The act of compressing, or state of being compressed.

"Compression of thought."

Johnson.