Let L be a regular language. That is, L is a set of words which is recognised by a finite state automaton (or, equivalently, by some regular expression). Consider a DFA recognising L, and suppose L is infinite. Suppose a DFA D recognising L has k states. We may write the steps D takes when recognising a word w=w1w2...wn above the letters:

This just means that after reading w1...wk, D is in state sk.

Then for a word w in L of length >k, due to the pigeonhole principle, some state s must repeat itself: s=si=sj, i < j. Then D must also accept

and so on. In fact, if we write x=w1...wi, y=wi+1...wj, z=wj+1...wn (x and y may both be empty), then w=xyz, and for all nonnegative integers m, the word xymz is also in L.

We've just proved

[The pumping] lemma: Let L be an infinite regular language. Then there exists some k such that if w∈L has more than k letters, then w=xyz where
For all j≥0, xykz∈L.

The pumping lemma is useful for proving certain languages are not regular. This approach is usually taken in introductory computation courses (see e..g pumping lemma proof that the balanced braces language is not regular). However, in practice the Myhill / Nerode theorem is much more useful for proving that languages are not regular.

Pumping lemmas also exist for CFGs (context free grammars); this is noded below.