A few notes about RSA from a practical perspective. This writeup is not intended to cover all of the various bits of history and details of RSA, simply because there are half a dozen other writeups in this node that seem to be doing a fine job of that.
The Modulus
The modulus n does not actually have to be equal to p*q with p and q prime. It can be the product of any number of prime numbers (ie, any composite number will work fine). Of course, this makes n easier to factor (by how much is not really known; most factoring estimates are based on factoring p*q). The benefit comes when using CRT-based RSA decryption. This method replaces an exponentiation modulo n with 2 exponentiations modulo p and one exponentiation modulo q, plus a few extra operations. Since the time for a modular exponentiation is exponential with the size of the operands, this is a major win for speed, and almost all crypto software, and most hardware, does it in this manner. If n has more factors, you can break the decryption up even more, using many small modular exponentiations instead of one big one. This is generally termed multi-prime RSA, and has been patented by Compaq in the US.
Generally you can get away with between 3 and 6 factors, depending on the total size of the key. The really nice thing about this is that you can give someone your public key, and it looks exactly like a regular RSA key. The only way to tell it's not is to factor the modulus.
The Public Exponent
The public exponent is chosen to be small for several reasons. First, it makes encryption and signature verification very fast, which is nice. In particular, someone will only sign something once, but it might be verified many thousands or even millions of times. More importantly, if the decryption exponent d is too small, then it's possible to recover the key. If e is relatively small (say, less than 232), d will always be sufficiently large to prevent the attack.
Commonly chosen values for the exponent include 3, 17, 257, and 65537. These numbers are all primes of the form 2n+1. Because there are only two 1 bits in the binary representations of these numbers, the exponentiation algorithm is quite fast (this is due to the properties of many common exponentiation algorithms). Prime numbers for the exponent are preferred because p-1 and q-1 must be relatively prime to e (this restriction also applies to additional factors if multi-prime RSA is used). It's easy to see that (assuming p and q are always chosen randomly), the restriction will be easiest to satisfy when e is a prime number.
Converting messages into numbers
Most examples of RSA (including the ones included in this node), basically ignore the problem of encoding a message into something usable by RSA. Many 'toy' examples, designed to showcase the mathematics involved, suggest turning the string to be encrypted into an integer by concatenating the ASCII representation of the characters.
A common size for an RSA modulus is 1024 bits (or 128 bytes). RSA cannot encrypt a message (really, RSA encrypts integers), larger than the modulus. So what about when we want to encrypt a 1 megabyte file? A single encryption using a megabyte long RSA modulus would take quite a few minutes on a fast machine, so that doesn't seem too great.
An alternative is to break the message up into pieces and encrypt them one at a time (in fact, some books suggest this as a serious option). But this too will be quite slow, and additionally insecure. For example, an attacker could undetectably rearrange the RSA encrypted blocks. Unfortunately, some crypto libraries, like Java's JCE, actually support using RSA in this manner (most do not). The real solution is to generate a random key, encrypt it with RSA, and then send the recipient both the RSA-encrypted key, and the message encrypted with a symmetric cipher such as DES, using the key.
Even this doesn't solve all of the problems. It's quite possible in some situations (such as a server that decrypts a message encrypted with RSA and processes it somehow) to attack RSA, if it is used in the manner mentioned above. These attacks usually allow the attacker to gain the decryption of a small number of ciphertexts. They are fairly hard to pull off, requiring a good deal of CPU time and the ability to send many thousands, or even millions, of messages to the server. However, they are much easier to accomplish than factoring the modulus.
To prove that this should not be taken lightly, I'll point out that attacks against SSL have been published which can recover a SSL session key in a fairly practical manner. However, because it would require connecting many many times to the server, it would undoubtably be noticed by the resident network engineers.
The way to work around this problem is to first choose the random key, and then encode it in some way that provides redundancy, and ensures that the input to the actual RSA operation is only a little bit smaller than the modulus. There are many ways to do this, some of which have been standardized in PKCS and IEEE 1363 (in particular, read OAEP).
Similar issues surround the use of RSA for signatures. IEEE 1363a has some solutions that seem fairly workable -- unfortunately, the best ones are patented.
This Writeup Sucks
This has been a brief and most incomplete coverage of some of the practical issues surrounding the use of RSA today. If you have comments or suggestions for additions, please, /msg away.