In this case of the general cryptography problem, both you and your adversary have limited computational resources. Everything you do must be simple enough to be done and undone with pencil and paper. Usually, a sender tries to transform a clear text message into a coded message so that a receiver can transform it back into the clear text, but any other listeners, especially adversaries, cannot.
Classical encryption methods include two main branches: substitution, in which one symbol is substituted for another, and message rearrangement. Also, message hiding can be considered a form of encryption.
The simplest form of substitution is the letter cipher, and the archetypal example is the kiddie code. To encode, you find the letter on the left side and substitute the number on the right side:
a=1, b=2, c=3... j=10... z=26
Of course, it is possible to choose simple substitution ciphers that don't have such an obvious solution:
a=17, b=3, c=25, d=12... (no pattern)
But unless there is an underlying pattern, these will be difficult to memorize. And no matter how clever the memorization scheme, a simple substitution cipher can be broken by anyone with enough time and a few relatively easy-to-generate tables.
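As a rough illustration of the two tables above, here is a minimal Python sketch. The names make_key, encode, and decode are hypothetical helpers invented for this example, and the seeded shuffle only stands in for "no pattern"; it is not a secure way to pick a key.

```python
import random
import string

def make_key(seed):
    """Build a shuffled letter-to-number table (hypothetical helper)."""
    numbers = list(range(1, 27))
    random.Random(seed).shuffle(numbers)  # illustration only, not secure
    return dict(zip(string.ascii_lowercase, numbers))

def encode(message, key):
    # Letters become numbers; spaces and punctuation are simply dropped.
    return [key[ch] for ch in message.lower() if ch in key]

def decode(numbers, key):
    # Invert the table so each number maps back to its letter.
    reverse = {number: letter for letter, number in key.items()}
    return "".join(reverse[n] for n in numbers)

# The kiddie code: a=1, b=2, ... z=26.
kiddie_key = {letter: i for i, letter in enumerate(string.ascii_lowercase, start=1)}
print(encode("cab", kiddie_key))  # [3, 1, 2]

# A table with no obvious pattern.
secret_key = make_key(seed=2024)
print(decode(encode("help me now", secret_key), secret_key))  # helpmenow
```

Note that the receiver needs the whole table, which is exactly the memorization problem described above.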
Breaking a substitution cipher
How? Well, let's start with letter frequency. The most common letters in English are, in roughly this order, ETAONRISH (the exact ranking varies with the text sampled). E is much more common than T, and T is markedly more common than A. After that, there's a bit of a mess. So, you count the frequency of each of the cipher-letters, and try out the common ones as E, or T.
Also, one can use the spaces. What are the only two one-letter words? A and I. If you see a one-letter word, you know that that character represents A or I. One can also quickly compile the complete list of two-letter words, and dramatically limit the number of possibilities. Similarly, one can look for common pairs of letters, especially double letters. This is easy enough to do by hand that a brief message encoded in this manner can be found near the crossword puzzle in many newspapers (in mine it is labelled "cryptoquotes"). People solve these routinely. If your message is as long as this sentence, it can be reliably decoded by millions of amateurs.
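The first two tactics (counting cipher-letters and spotting one-letter words) can be sketched in a few lines of Python. The function names and the toy ciphertext are assumptions made up for this example; the ciphertext is "i see a tree" with every letter moved one place forward in the alphabet.

```python
from collections import Counter

def letter_frequencies(ciphertext):
    """Return (symbol, count) pairs, most common first."""
    letters = [ch for ch in ciphertext.lower() if ch.isalpha()]
    return Counter(letters).most_common()

def one_letter_words(ciphertext):
    """Symbols that stand alone must represent A or I."""
    return {word for word in ciphertext.lower().split()
            if len(word) == 1 and word.isalpha()}

ciphertext = "j tff b usff"
print(letter_frequencies(ciphertext)[0])  # ('f', 4), so f is a good guess for E
print(sorted(one_letter_words(ciphertext)))  # ['b', 'j'], each is A or I
```

On a message this short the counts are only a hint; the longer the message, the more closely they follow the usual English frequencies.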
Of course, one can easily get around these decryption tactics: strip your message of spaces, or put them in randomly. Split up E between two characters so that it is no longer most common. Intentionally word things strangely to avoid common words and letters. Misspell rampantly (yet unambiguously). Have characters which mean nothing, and randomly insert them into your message. Have a character which means 'repeat the previous letter', one which means 'repeat the letter before the previous letter', and one which means 'repeat the previous word that began with the next letter'. Treat I, J, and Y as the same letter (there are linguistic reasons that this will rarely if ever be ambiguous). Treat U and V as the same letter (same here). Treat Q and Z as the same letter (No good reason, but can you think of any examples?). Let there be no character for "Q", only a character for "QU"... the possibilities are endless.
Any of these ciphers will be stronger if you rotate the key so that which symbol stands for which letter changes throughout the message. One can do this easily with a Vigenère square or code wheel. If you always progress forward through your key and never repeat any part of it, it is a one-time pad. If the pad is completely random, kept secret, and never reused, then the message cannot be broken by any means. Cool!
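A minimal sketch of the rotating key, assuming messages made only of the capital letters A to Z. The helper names shift_text and random_pad are hypothetical. With a short repeating keyword this behaves like a Vigenère square; with a random key as long as the message it behaves like a one-time pad.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase

def shift_text(text, key, direction=1):
    """Shift each letter by the matching key letter (key repeats if short)."""
    out = []
    for i, ch in enumerate(text):
        k = ALPHABET.index(key[i % len(key)])
        out.append(ALPHABET[(ALPHABET.index(ch) + direction * k) % 26])
    return "".join(out)

def random_pad(length):
    """A fresh random key as long as the message, to be used only once."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

# Repeating keyword: the same keyword letters come around again and again.
print(shift_text("ATTACKATDAWN", "LEMON"))  # LXFOPVEFRNHR

# One-time pad: the key never repeats within the message.
message = "HELPMENOW"
pad = random_pad(len(message))
coded = shift_text(message, pad)
print(shift_text(coded, pad, direction=-1))  # HELPMENOW
```

The only difference between the two uses is the key: a repeating keyword leaves a pattern to find, while a pad that is random, secret, and never reused leaves none.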
Rearrangement is the second main class of classical cryptography schema. For example, take the message "help me now!". Now move every even character to the end of the message. If you count spaces as characters, you get "hl enwepm o!". To decode, split the message into two halves, then read the characters off into one message, alternating which half you read from. This is called the picket fence code. You can do this trick with longer cycles than 2. The case of a four-long cycle is called the compass code.
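The picket fence code can be sketched in Python as below. The function names are made up for this example, and the cycle parameter is an assumption about how the longer cycles would generalize (cycle=4 being the compass code).

```python
def picket_fence_encode(message, cycle=2):
    """Read off every cycle-th character, starting at each offset in turn."""
    return "".join(message[start::cycle] for start in range(cycle))

def picket_fence_decode(coded, cycle=2):
    """Deal each group of characters back into its original positions."""
    n = len(coded)
    out = [""] * n
    pos = 0
    for start in range(cycle):
        size = len(range(start, n, cycle))  # how many characters this group holds
        out[start::cycle] = coded[pos:pos + size]
        pos += size
    return "".join(out)

print(picket_fence_encode("help me now!"))  # hl enwepm o!
print(picket_fence_decode("hl enwepm o!"))  # help me now!
```

The decoder does not need the two halves to be equal in length; it works out the size of each group from the message length and the cycle.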
A classical method of carrying this out is called the scytale. One takes a cylinder (originally a staff or spear) of a particular width and wraps a strip of paper around it in a spiral. Then, one writes a message down the staff with one letter on each twist of the strip. After finishing one line, you twist the staff by an eighth of a turn (or whatever) and start back towards the beginning of the strip. With the wrong diameter of staff (i.e. the wrong length of cycle) this cannot be read directly.
You can also read off the letters into a grid in some prearranged order, then read them off the grid in some different order. For example, read this message down the first column, then up the second, down the third, and so on, to find the clear text of this coded message.
d l m o t s a h e i o
o l a d h i g t r g n
n e r o a t a s r i p
t t y t t i i n e l y
Of course, when sending the code one wouldn't put it in this nice rectangular format for everyone to read. It would be sent as "dlmot sah eio o ladh igt rg nneroatasr iptt yt tii nely" and it would be up to the receiver to take out the spaces, put it into the rectangular grid four deep, and read up and down the columns.
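The receiver's procedure can be sketched in Python as follows. The function name read_grid_columns is hypothetical, and the sketch assumes the message length divides evenly by the grid depth.

```python
def read_grid_columns(coded, depth=4):
    """Rebuild the grid row by row, then read down and up alternate columns."""
    letters = coded.replace(" ", "")      # the spaces in the sent text are decoys
    width = len(letters) // depth         # assumes an exact rectangle
    rows = [letters[r * width:(r + 1) * width] for r in range(depth)]
    clear = []
    for col in range(width):
        column = [row[col] for row in rows]
        if col % 2 == 1:                  # second, fourth, ... columns are read upwards
            column.reverse()
        clear.extend(column)
    return "".join(clear)

coded = "dlmot sah eio o ladh igt rg nneroatasr iptt yt tii nely"
print(read_grid_columns(coded))
```

Running this prints the clear text as one unbroken string of 44 letters. The last two letters appear to be padding that fills out the rectangle.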
These can become arbitrarily complicated, being an arbitrary element of Sn, the symmetric group on n letters. Sn has n! elements, and only a few will be even close to clear text. You have all the rest to choose from, and that's a lot! The largest factorial smaller than a googol is 69!, and most messages will be longer than 69 letters. Thus, there is a lot of room in which to mess around with the message. If the rearrangement code takes advantage of this expansion of possibilities, the longer the message gets, the better encrypted it is, as opposed to ciphers, which become less secure as the message becomes longer.
Of course, just as with substitution, the trick is to pick a change that is easy to remember. Any pair of correspondents who have good geometrical memory can remember these patterns more easily than ciphers, and they can be quite nasty to break. For example, one code could be filling up a chess board with letters in the order of a prearranged knight's tour, and then reading the letters out row by row.
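The size of Sn is easy to check with exact integer arithmetic. The short Python sketch below (the helper name is invented for this example) confirms that 69! is the largest factorial below a googol, since 70! already exceeds it.

```python
import math

GOOGOL = 10 ** 100

def largest_n_with_factorial_below(limit):
    """Largest n such that n! is still smaller than limit (limit > 1)."""
    n = 1
    while math.factorial(n + 1) < limit:
        n += 1
    return n

print(largest_n_with_factorial_below(GOOGOL))    # 69
print(math.factorial(69) < GOOGOL < math.factorial(70))  # True
print(len(str(math.factorial(69))))              # 99 digits
```

So any message of 70 or more characters has more than a googol possible rearrangements.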
For codes, you ignore the character content of the message, and only include symbols to refer to the meaning. For example, let's take a look inside a codebook that turns a deadly serious message into a whimsical note about Scooby Doo. Osama Bin Laden = Ralph; the Khyber pass = the supermarket; Al Qaeda troops = bananas; helicopter = the Mystery Mobile; stealth = oil; Muhammed Jamik (a secret agent working for the CIA) = Scooby; Missile Strike = haircut.
Given that, it shouldn't be too hard to decode the following seemingly innocuous (and idiotic) message: "Scooby-Doo is with Ralph in the supermarket. Ralph has 12 bananas, and is going out to the Mystery Mobile, which has had an oil change. We think he needs a haircut." Of course, the message will need to be interpreted somewhat since the exact meanings don't make sense when substituted.
Codes require small vocabularies or large codebooks, while ciphers can pass almost any message given a small amount of memorization. However, ciphers can be completely broken much more easily than codes. Knowing two letters may quickly reveal what all the others are. However, even if a listener knew that the Mystery Mobile stood for helicopter, Scooby was some pesky secret agent, and oil meant stealth, the listener wouldn't know anything relevant about the message, except that there was someone with something going to a stealth helicopter somewhere, and something or other was to be done about it. The listener would probably think that this referred to an American operation! Furthermore, with a code, you can let common concepts be represented by brief symbols, and achieve compression as well as encryption. Back in the days of telegrams, companies would have codes which were only incidentally encrypted, and primarily compressed.
Lastly, one can hide the message so that a listener does not know what the coded content is, or even that a message has been sent at all. For example, one can write a poem or story which appears to have no significance, but which contains an encoded message. Take the poems for memorizing pi: if you don't realize that the word lengths correspond to the digits of pi, you won't even think that it's a code, just a really bad poem. Such a method can also be used to hide password reminders in a physically insecure environment. That cute letter from your kid? The password is the second letter of each misspelled word.
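The word-length trick is simple enough to sketch in Python. The function name is hypothetical, and the sample sentence is a well-known pi mnemonic rather than anything taken from a particular poem.

```python
def word_lengths(text):
    """Count the letters in each word, ignoring punctuation."""
    lengths = []
    for word in text.split():
        letters = sum(ch.isalpha() for ch in word)
        if letters:                     # skip tokens with no letters at all
            lengths.append(letters)
    return lengths

print(word_lengths("How I wish I could calculate pi!"))  # [3, 1, 4, 1, 5, 9, 2]
```

One limitation of this scheme is that a plain word count has no natural way to represent the digit zero, so the writer needs some extra convention for it.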
One can also hide a message among elements in drawings, in typos in a newspaper edition, in invisible ink on an otherwise blank page, or in lights hung from a church steeple. In fact, this latter is one of the more famous codes ever - one if by land, two if by sea.
Now that there are computers to crack codes, rearrangement and simple substitution ciphers on their own are no longer tenable if your adversary has the appropriate resources. However, hiding messages can still be done quite easily, as can one-time pads, or even clever ciphers whose letters are difficult for an unaccustomed human reader to distinguish. What do they feed into the computer if no two letters are exactly the same, but many are similar, differing in a hundred different ways, each of which may or may not be relevant?
Classical cryptography has wide avenues and much room for creativity. The art arises when you try to balance the competing needs of a code while creating one. Does the code need to be easy to encode? Does it need to be easy to decode? Does it have to be brief, or can lots of garbage be thrown in to mislead the listeners? Does the code need to remain secure even when the general meaning of a message can be determined by adversaries? Does the rest of the code have to remain secure even when both the clear text and the coded text of a message fall into the hands of adversaries? That one is tough, but surprisingly, possible.