Chunking was first introduced by Harvard psychologist George A. Miller in his article "The Magical Number Seven, Plus or Minus Two", The Psychological Review, vol. 63, 1956. The title reflects Miller's theory, backed by a large body of research, that short-term memory can hold only about seven units of information, plus or minus two, at a time. It followed that if you chunk information based on similarities, you can take in more information at a time. A related idea was that giving anyone more than nine pieces of information to remember at once is pointless.

From the summary in "The Magical Number Seven, Plus or Minus Two":

"First, the span of absolute judgment and the span of immediate memory impose severe limitations on the amount of information that we are able to receive, process, and remember. By organizing the stimulus input simultaneously into several dimensions and successively into a sequence of chunks, we manage to break (or at least stretch) this informational bottleneck."

From the Superior Curriculum Project:

"Chunking enhances working memory, not by increasing the number of items you can hold in working memory, but by increasing the size or complexity of those items."