A stream compression algorithm capable of compressing a random stream to a mean length within two bits of its entropy. Conversly, optimal block coding algorithms such as Huffman codes can potentially have an overhead of up to one bit per symbol encoded.
Arithmetic coding uses an oracle to estimate the probability of each string which can be produced by the stream. The strings are then assigned a subinterval of the interval 0->1 in lexicographic order. Output strings from the stream can then be represented by a binary string which uniquely specifies the relevent subinterval. It can be shown that the binary string produced will have a length of no more than two bits greater than the information content of the encoded string, hence the mean length lies within two bits of the entropy].