16-bit characters (as in Unicode) are enough to represent any single language. What they aren't enough to do is represent all languages at the same time, which is what you need in order to mix various Asian languages in the same document(1). To make this possible, East Asian encodings use so-called shift codes (e.g., Shift-JIS): some values in a string that would normally contain character codes are defined to be an escape. When an escape is encountered, the next values are read from the string and combined with it to find the actual character to use. This allows an arbitrary number of bits per character, but is a pain to program with.
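To illustrate why this is a pain, here is a minimal Python sketch that splits an encoded string into one chunk per character. The lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are an assumption based on the usual Shift-JIS layout, and split_units is a hypothetical helper written for this example, not part of any library.

```python
def split_units(data):
    """Split encoded bytes into one chunk per character (illustrative only)."""
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        # Assumed lead ("escape") byte ranges for Shift-JIS; any other
        # value is treated as a complete one-byte character.
        is_lead = 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC
        width = 2 if is_lead else 1
        if i + width > len(data):
            raise ValueError("truncated character at offset %d" % i)
        units.append(data[i:i + width])
        i += width
    return units


# Python's built-in codec produces the bytes; the sketch only splits them.
encoded = "A日本".encode("shift_jis")
units = split_units(encoded)
print([len(u) for u in units])  # [1, 2, 2]
```

Note that you cannot index straight to the n-th character or safely start reading in the middle of the string: you have to scan from the beginning to know whether a given value is a character on its own or the second half of one.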
(1) If I remember correctly, there is an extra constraint, too: people are unwilling to have the same glyph (graphical symbol) encode to the same value when it has different semantic meanings. If we were encoding English with one value per word, that would be the same as wanting a different value for the "to" in "Go to London" and the "to" in "To be or not to be."