pseudo-random function (idea) by randombit

A pseudo-random function (or PRF) is, basically, a polynomially computable function that 'seems' random. Somewhat more formally, it could be specified as:

f : {0,1}ⁿ {0,1}^m -> {0,1}^o

for some fixed n, m, and o. Basically it means that the function (f) takes as inputs two bitstrings, one of length n and one of length m, and produces one of length o (the output). We assume one of the inputs is secret, known only to those computing the function, while the other is a known (but not fixed) value. By fixing the secret, we have an actual PRF:

f_s : {0,1}ⁿ -> {0,1}^o

which, if f is a good PRF, will look like a random function (ie, a function which basically does a lookup into a randomly generated table). More specifically, if there is any way to distinguish this function (with unknown secret) from a random function in polynomial time, then it is not a good PRF.

PRFs are useful beasts. For example, they are used heavily in SSL/TLS and in most other cryptographic protocols. They are especially common for deriving multiple keys from a single shared secret. Say two programs shared a key via Diffie-Hellman, and they wanted both a cipher key and a MAC key. They might do this by keying a PRF with the secret, and then running the PRF twice with two different strings (for example, "MAC KEY" and "CIPHER KEY", or 0 and 1, or whatever). Then they use the two outputs for their respective purposes.

In fact, in some cases things that are not actually PRFs (like block ciphers) are used as PRFs. For example, when using a block cipher in OFB or CTR mode, you never use the cipher in the inverse direction. Thus, it doesn't really matter at all that the block cipher is actually invertible; it's used like a PRF (in fact, it's a pseudo-random permutation, but a PRF and a PRP and mostly interchangeable in this particular case).

Some example PRFs (not always good ones) follow:

f(x,s) = MD5(x || s)
f(x,s) = HMAC/SHA-1_s(x)
f(x,s) = AES_s(x)

though usually PRFs are a bit (or a lot) more complicated. For example the TLS PRF involves uses a recursive structure that uses both MD5 and SHA-1; the idea was to prevent a weakness in either hash from destroying the security of the PRF. Others, like the ones in IEEE 1363, are not much more complicated than the simple examples above.

At this time almost no research has been done to put PRF design in any kind of solid theoretical framework. We have a very good idea of how a PRF should act, but no real way to compare the properties of the PRFs we design, with what the theory says. Thus the designs are usually either rather weak, or are extremely paranoid (and slow) where it's probably not necessary. Unless someone smarter than me gets off their butt and does this work, we will continue to see ad hoc (and sometimes insecure) designs pop up again and again.

OFB	IEEE 1363	Message Authentication Codes	DESX
CTR	Cryptology	coffin nail	pseudo-random permutation
random walk	SSL	Block cipher	TLS
SHA-1	Egress	Cryptographic	um
exists	function	Random