Breaking a Vigenère cipher whose key is repeated many times over the message takes two steps. The first is determining the length of the key. The second is running frequency analysis on each group of letters that was encrypted with the same letter of the key.

To determine the key length, first look for sequences of letters that appear more than once in the ciphertext. A repetition could have been caused by the same plaintext letters being encrypted with the same part of the key, or by different letters coincidentally encrypting to the same ciphertext letters. The longer the repeated sequence, the more likely it was caused by the first possibility.

If the repetition we have found is caused by the same part of the key being used, then the length of the key is a factor of the distance between the repetitions. The length can be narrowed down by finding additional repetitions and looking for common factors in their spacings. Doing this we can make a pretty good guess at the possible length of the key.
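The repeated-sequence search described above can be sketched in Python roughly as follows. This is only a minimal illustration, assuming the ciphertext is a string of uppercase letters with spaces and punctuation already stripped; the function names `repeated_spacings` and `factor_votes` and the `max_len` cut-off are made up for this sketch.

```python
from collections import Counter

def repeated_spacings(ciphertext, seq_len=3):
    """Distances between successive occurrences of each repeated sequence."""
    last_seen = {}   # sequence -> index of its most recent occurrence
    spacings = []
    for i in range(len(ciphertext) - seq_len + 1):
        chunk = ciphertext[i:i + seq_len]
        if chunk in last_seen:
            spacings.append(i - last_seen[chunk])
        last_seen[chunk] = i
    return spacings

def factor_votes(spacings, max_len=20):
    """For each candidate key length, count how many spacings it divides."""
    votes = Counter()
    for gap in spacings:
        for n in range(2, max_len + 1):
            if gap % n == 0:
                votes[n] += 1
    return votes
```

For the ciphertext CSASTPKVSIQUTGQUCSASTPIUAQJB (CRYPTOISSHORTFORCRYPTOGRAPHY under the key ABCD), every repeated three-letter sequence sits 16 letters apart, so the candidates 2, 4, 8 and 16 tie. More repetitions at other spacings would be needed to separate them.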

Now frequency analysis can be used on every Nth letter of the ciphertext, where N is the length of the key. So if our guess is that the key has a length of 10, take the 1st, 11th, 21st... letters and perform frequency analysis on them. Then do the same for the 2nd, 12th, 22nd, etc., and so on for each letter of the key. Each group was encrypted with a single key letter, so it behaves like a simple shift cipher.
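One way to automate the per-group frequency analysis is sketched below. It assumes uppercase A-Z ciphertext and English plaintext, and it scores each of the 26 possible shifts with a chi-squared statistic, which is one common choice and not the only one. The frequency table holds approximate percentages that vary from corpus to corpus, and the names `best_shift` and `recover_key` are hypothetical.

```python
from collections import Counter

# Approximate English letter frequencies in percent (corpus-dependent).
ENGLISH_FREQ = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23,
    "G": 2.02, "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03,
    "M": 2.41, "N": 6.75, "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99,
    "S": 6.33, "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36, "X": 0.15,
    "Y": 1.97, "Z": 0.07,
}

def best_shift(letters):
    """Return the shift (0-25) under which `letters` looks most like English."""
    if not letters:
        return 0
    counts = Counter(letters)
    total = len(letters)
    best, best_score = 0, float("inf")
    for shift in range(26):
        score = 0.0
        for i in range(26):
            plain = chr(ord("A") + i)
            cipher = chr(ord("A") + (i + shift) % 26)
            expected = total * ENGLISH_FREQ[plain] / 100
            score += (counts[cipher] - expected) ** 2 / expected
        if score < best_score:
            best, best_score = shift, score
    return best

def recover_key(ciphertext, key_len):
    """Guess each key letter from every key_len-th letter of the ciphertext."""
    return "".join(
        chr(ord("A") + best_shift(ciphertext[offset::key_len]))
        for offset in range(key_len)
    )
```

With a short ciphertext the groups are small and the guess for some key letters may well be wrong, so the output is better treated as a starting point than as an answer.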

This technique works better the longer the ciphertext and the shorter the key. And even once you have identified the key length, it has the same weakness as any frequency analysis: it struggles if the encrypted text doesn't follow normal frequency patterns.

This technique is rendered useless if the key is as long as the encrypted message. But if the key is made up of common words, the Vigenère cipher is still breakable. Guessing common words that are probably in the text, then trying them at various positions in the ciphertext, might reveal possible words in the key. As possible parts of the key are revealed, more educated guesses can be made at what the keywords are and what the encrypted text is. This technique is tedious and can be a long process, but it can work.
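Trying a guessed word at every position can be sketched like this. It assumes uppercase A-Z text and the usual Vigenère rule that ciphertext = plaintext + key (mod 26), so subtracting the guessed word from the ciphertext gives the stretch of key that would have produced it. The name `drag_crib` is made up for the sketch.

```python
def drag_crib(ciphertext, crib):
    """Return (position, key_fragment) for the guessed word at each position."""
    results = []
    for pos in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[pos:pos + len(crib)]
        # key letter = ciphertext letter - plaintext letter (mod 26)
        fragment = "".join(
            chr((ord(c) - ord(p)) % 26 + ord("A"))
            for c, p in zip(window, crib)
        )
        results.append((pos, fragment))
    return results
```

Most fragments will be gibberish. The ones worth following up are those that look like part of a word, since they suggest both where the guessed word sits and what the key says at that point.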

Finally, if the key is as long as the plaintext message and its letters were selected randomly, then we have a one-time pad, which is truly unbreakable but suffers from many practical problems, including key distribution, key generation, and ensuring keys are not reused.

Frequency analysis can be used to break any substitution cipher, even if the key is arbitrarily long, provided you can assume some knowledge of the language of the plaintext and of the key. You need a computer, but the program is fast and easy; the harder part is knowing detailed facts about the language.

You could encrypt the German text of Das Kapital using the French text of Madame Bovary as the key, and the same principles apply, but for ease of exposition I'll stick to English. Here the kind of knowledge you need is that the commonest word is THE, that this occurs inside other common words such as THEN, WHETHER, OTHER, that TH and WH are very common sequences, and so on.

In English text E has a frequency of about 13%, T of 9%, and so on down in a characteristic pattern. Any text QLPZJHWSNA... exhibiting this pattern is monoalphabetic and is crackable instantly. As Xamot has explained above, an n-letter keyword will create ciphertext that exhibits this same characteristic pattern at precisely every nth place, wherever you start counting from. A good example is at the back of Alan Garner's Red Shift: because it also preserves spacing and punctuation, it can be solved by hand. But in general you just feed it to a computer and test spacings of successive n until you hit the magic pattern. This is, as Xamot indicated, pretty trivial for any key length << the text length.
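Testing successive spacings can be done in several ways; the sketch below uses the index of coincidence as a stand-in for "the magic pattern", which is my substitution and not something the writeups above specify. The index is the chance that two letters picked at random from a sample are the same: roughly 0.066 for English and roughly 0.038 (1 in 26) for evenly spread letters. It assumes uppercase A-Z ciphertext, and the function names are hypothetical.

```python
from collections import Counter

def index_of_coincidence(letters):
    """Probability that two letters drawn from `letters` are identical."""
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def ic_by_key_length(ciphertext, max_len=20):
    """Average index of coincidence of the n interleaved groups, for each n."""
    scores = {}
    for n in range(1, max_len + 1):
        groups = [ciphertext[offset::n] for offset in range(n)]
        scores[n] = sum(index_of_coincidence(g) for g in groups) / n
    return scores
```

The spacing at which the average jumps up towards the English figure is a good candidate for the key length. Multiples of the true length score high as well, so the smallest high-scoring n is usually the one to try first.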

That depends on the encryption patterns being cyclically re-used. A one-time pad has no cyclic repetition. A random one-time pad is totally uncrackable, unless you find the piece of paper it's written on. In fact it's pointless attacking the cipher: you might as well capture the agent who's already decrypted it and torture them until they tell you what it said.

The interesting middle position is a non-random one-time pad. This is crackable by the CIA and Echelon and Mossad, and by dozens of the people contributing to E2 if they devoted a bit of computer time to it and could be bothered, but it does resist attack by the ten-line program that solves cyclic keys. It keeps your kid sister out, and the beauty and temptation is that it's so easy to do: no hard typing or programming.

The easiest way to get hold of an arbitrarily long key is to use a publicly available text. Chapter One of Pride and Prejudice is a terrible choice, because once someone thinks you're doing this, the first keys they're going to try are IT IS A TRUTH UNIVERSALLY ACKNOWLEDGED, followed by CALL ME ISHMAEL and IN A HOLE IN THE GROUND THERE LIVED A HOBBIT and any novels or lyrics they know mean something to you. But for the purposes of illustration let's say we're using Pride and Prejudice.

The commonest word, in fact the commonest three-letter sequence, in Pride and Prejudice, is THE. This is also the commonest three-letter sequence in the plaintext. The T-encryption of THE is MAX; the H-encryption is AOL, and the E-encryption is XLI. (See Vigenère cipher for the table.) So in the encrypted text the sequences MAX, AOL, and XLI are going to occur equally probably, and with far higher probability than the 1 in 26^3 expected by chance. These will stand out. You will also get significantly higher than random occurrences of the three THE-encryptions for the HEN of WHEN and THEN, the ILL of WILL, and for ING and AND and WHO and FOR.
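The shifted forms quoted above are easy to reproduce, and the claim about which three-letter sequences stand out can be checked empirically on any ciphertext you generate yourself. The sketch below assumes uppercase A-Z text with spaces removed and a key text at least as long as the plaintext; all three function names are made up for illustration.

```python
from collections import Counter

def shift_word(word, key_letter):
    """Encrypt every letter of `word` with the same key letter."""
    k = ord(key_letter) - ord("A")
    return "".join(chr((ord(c) - ord("A") + k) % 26 + ord("A")) for c in word)

def running_key_encrypt(plaintext, keytext):
    """Vigenère encryption with a key text that is never repeated."""
    if len(keytext) < len(plaintext):
        raise ValueError("key text must be at least as long as the plaintext")
    return "".join(
        chr((ord(p) + ord(k) - 2 * ord("A")) % 26 + ord("A"))
        for p, k in zip(plaintext, keytext)
    )

def top_trigrams(text, n=10):
    """The n most frequent three-letter sequences in `text`, with counts."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2)).most_common(n)

for key_letter in "THE":
    print(key_letter, shift_word("THE", key_letter))
```

Running `top_trigrams` on the output of `running_key_encrypt` for two long English texts shows which sequences actually rise above the background rate.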

If you try to decrypt according to these second-order frequencies, gradually you will build up simultaneous pictures of at least the grammar of both the keytext and the plaintext. If you're the one trying to keep the secret, I suppose picking a Chinese web page and using the hexadecimal image of that as your one-time pad would make it a lot harder for frequency analysis to gain any leverage.
