History and Use of the Diaeresis
or, More Than You Ever Wanted to Know About Two Silly Dots Above a Vowel

History

The concept of diaeresis comes from Latin verse. A verse of traditional dactylic hexameter consists of six metrons (or metra, to be pedantic), where each metron is nominally a single dactyl, i.e. a trisyllabic foot (in practice, a two-syllable spondee may stand in for it). For example, the first line of the Aeneid is:

arma virumque cano, Troiae qui primus ab oris

which, when divided into metra, becomes:

arma vi | rumque ca | no, Troi | ae qui | primus ab | oris

As you can see, metra are marked off without regard to the underlying words, but sometimes they coincide, as in the boundary between the fourth and fifth metra above. Unlike a caesura, which is an intentional pause on a word boundary within a metron, this is mere coincidence and should be (almost) ignored when reciting. The mark for this is the original diaeresis (Greek for division), and it was denoted with two vertical lines, like this:

arma vi | rumque ca | no, Troi | ae qui || primus ab | oris

Romance Usage (Trema)

Skipping ahead a few millennia, this notation shrank down into two little dots, also called the trema, placed on the second character. It was borrowed into French and most other Romance languages for separating digraphs (pairs of letters that form only one sound) and mere adjacent vowels, as in naïve, which is na-eve but would be read nayve without the diaeresis. English, in turn, borrowed this from French. However, in English the popularity of the diaeresis has waned during the last 100 years (why bother when English spelling is so screwy anyway?) and these days only a few bastions of "proper" style, most notably The New Yorker, still insist on the coöperation of their authors. Since this use of the diaeresis does not modify the sound of the letter it is on, it is left out if the word is hyphenated in the same place.

There is one other subtly different usage, not known in English but found in (at least?) French and Spanish: if a diaeresis is placed on the first vowel of a digraph, as in French aigüe or Spanish antigüedad, it indicates that the vowel in question is sounded, not silent. (In French, this is the result of a recent language reform and many people still place the diaeresis on the second vowel.)

Germanic Usage (Umlaut)

And that's where the story ends for English (and Webster 1913), but the diaeresis has acquired many a new use in other languages, most notably as a representation for umlaut in Germanic languages (e.g. German, Swedish, Norwegian, Icelandic) and Celtic. The original glyph for umlaut was a little e atop the modified vowel, but this eventually morphed to look exactly like the diaeresis.

To further confuse things, German vowels with a diaeresis on top were lifted wholesale into the orthographies of many unrelated languages (e.g. Finno-Ugrian, Turkic, Albanian and even Chinese pinyin), which do not have the grammatical concept of umlaut. The unfortunate result is that many speakers of these languages speak of e.g. "umlaut a" when they really mean "diaeresis a", as can be seen from the utter confusion in the node by that name. (This usage is so widespread that it can no longer be called incorrect; it's just highly confusing.)

I meant to provide a summary of how ä, ö and ü are "usually" pronounced here, but after some thought came to the conclusion that the whole situation is just way too big a mess to summarize usefully, so please consult the individual entries for the characters or languages you are interested in instead. (ÿ is an especially odd little character.) Hungarian deserves a special mention though, since it distinguishes between a short 'ö/ü' and a long 'ö/ü' by diagonally stretching the dots into something resembling a quotation mark, resulting in ő and ű! Unicode terms this the U+030B COMBINING DOUBLE ACUTE ACCENT ( ̋) but don't be fooled, it's a diaeresis in disguise.
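To see what Unicode itself makes of the Hungarian letters, here is a minimal sketch using Python's standard unicodedata module. The helper name decompose is my own invention for illustration; it lists the code points a character splits into under canonical decomposition (NFD).

```python
import unicodedata

def decompose(ch: str) -> list[str]:
    """List the code points of ch after canonical decomposition (NFD)."""
    return [
        f"U+{ord(c):04X} {unicodedata.name(c)}"
        for c in unicodedata.normalize("NFD", ch)
    ]

# Short ö/ü carry U+0308; long ő/ű carry U+030B instead.
for ch in "\u00f6\u0151\u00fc\u0171":
    print(ch, "->", decompose(ch))
```

As far as Unicode is concerned, the long vowels are a base letter plus a double acute accent; the link to the diaeresis is purely historical.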

One last note of minor interest: whereas the Romance diaeresis is only punctuation and accented characters are usually not considered letters, characters with a Germanic diaeresis are almost always dealt with as separate letters of the alphabet when sorting and alphabetizing.
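The difference shows up as soon as you sort a word list. The following is only a rough sketch, not a real collation algorithm: letter_key and punctuation_key are hypothetical helpers, the Swedish alphabet (a to z, then å, ä, ö) stands in for the Germanic case, and the Romance case is approximated by simply throwing the marks away.

```python
import unicodedata

SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {letter: i for i, letter in enumerate(SWEDISH_ALPHABET)}

def letter_key(word):
    """Germanic style: å, ä and ö are letters of their own, sorted after z."""
    composed = unicodedata.normalize("NFC", word.lower())
    # Anything outside the alphabet sorts last (a simplification).
    return [RANK.get(c, len(RANK)) for c in composed]

def punctuation_key(word):
    """Romance style: drop the combining marks and compare base letters."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))

words = ["zon", "öl", "ost"]
print(sorted(words, key=letter_key))       # ['ost', 'zon', 'öl']
print(sorted(words, key=punctuation_key))  # ['öl', 'ost', 'zon']
```

Real software would normally leave this to a locale-aware collation library; the point here is only that the same dots lead to two different orderings.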

Slavic Usage

In many Slavic languages, the diaeresis is used in the character 'ë', the glyph for which is essentially identical in its Roman and Cyrillic representations, although Unicode separates them into U+0451 CYRILLIC SMALL LETTER IO (ё) and U+00EB LATIN SMALL LETTER E WITH DIAERESIS (ë). In Russian this is read as "o" or "io" depending on its position within a word.
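A quick way to convince yourself that the two look-alikes really are different characters is to ask Python's unicodedata module. This is just an illustrative sketch; describe is a made-up helper name.

```python
import unicodedata

latin_e = "\u00eb"       # Latin e with diaeresis
cyrillic_io = "\u0451"   # Cyrillic io

def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

print(describe(latin_e))
print(describe(cyrillic_io))
print(latin_e == cyrillic_io)  # False, however similar the glyphs look

# Both decompose to a base letter plus U+0308, but the base letters differ:
# Latin e (U+0065) versus Cyrillic ie (U+0435).
for ch in (latin_e, cyrillic_io):
    print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)])
```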

IPA Usage

The International Phonetic Alphabet (IPA) has its own meanings for the diaeresis, which I happen to think are highly bogus. At any rate, in the IPA world, a diaeresis above a vowel means "centralized" and a diaeresis below (!) a vowel means "breathy voiced". These probably confuse even linguists.

Fake Usage

English speakers tend to think both that the diaeresis looks funny and that it can be plunked down anywhere in a word, resulting in names like Motörhead and the wonderfully perverse Häagen-Dazs. Nöt müch Ï cän säy äböüt thät, nöw ïs thërë?

Incidentally, in case you're wondering why both the examples above are impossible, read up on vowel harmony.

Representation

The diaeresis can be found in Unicode as U+0308 COMBINING DIAERESIS ( ̈) and also within the Latin-1 supplement as U+00A8 DIAERESIS (¨). Ideally, all accented characters should be formed by using the combining diaeresis printed on top of the unaccented vowel, but for historical reasons Unicode includes all of ISO-8859-1 and its vast multitude of precomposed characters like U+00FC LATIN SMALL LETTER U WITH DIAERESIS (ü). This is a kludge, but a necessary one for the time being.
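The practical upshot is that the same visible ü can arrive as two different code point sequences, and a naive string comparison will call them different. Here is a minimal sketch of the usual workaround, normalizing both sides first; same_text is a hypothetical helper name, and only Python's standard unicodedata module is assumed.

```python
import unicodedata

precomposed = "\u00fc"   # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
combined = "u\u0308"     # u followed by U+0308 COMBINING DIAERESIS

def same_text(a, b):
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(precomposed == combined)           # False: different code points
print(same_text(precomposed, combined))  # True: canonically equivalent
print(len(precomposed), len(combined))   # 1 2

# The spacing U+00A8 DIAERESIS has only a compatibility decomposition:
# a space followed by the combining diaeresis.
print(unicodedata.normalize("NFKD", "\u00a8") == " \u0308")  # True
```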

For HTML character entities, the diaeresis is systematically called an umlaut, or rather just "uml", as in &uml; (¨) and &auml; (ä).
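If you need to shuttle between the entities and the actual characters, Python's standard html module knows the whole table. The sketch below is one possible approach, not a recommendation: to_named_entities is a hypothetical helper that swaps each non-ASCII character for its named entity when one exists and leaves everything else alone.

```python
import html
from html.entities import codepoint2name, name2codepoint

def to_named_entities(text):
    """Replace non-ASCII characters with named HTML entities where one exists."""
    return "".join(
        f"&{codepoint2name[ord(c)]};"
        if ord(c) > 127 and ord(c) in codepoint2name
        else c
        for c in text
    )

print(html.unescape("&uml; &auml; &Ouml;"))  # ¨ ä Ö
print(hex(name2codepoint["uml"]))            # 0xa8
print(to_named_entities("Motörhead"))        # Mot&ouml;rhead
```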

Typographically, the Romance diaeresis is often denoted with smaller, lighter dots than the other types. Unicode, however, does not make this distinction.

Obscure diaeretic bugs in E2

You can't use a single high-ASCII character like ä as a node title; you have to use an HTML character entity like &auml; instead.

References

  • years of personal experience battling with software over the issue
  • http://www.unicode.org
  • http://www.hclrss.demon.co.uk/unicode/
  • http://czyborra.com/unicode/characters.html
  • http://www.skidmore.edu/academics/classics/courses/1998fall/cl202/resource/meter/metintro.html
And thanks to Albert Herring, Gritchka, thbz, Tiefling and tres equis for corrections.

Di*aer"e*sis, Di*er"e*sis. [L. diaeresis, Gr. διαίρεσις, fr. διαιρεῖν to divide; διά through, asunder + αἱρεῖν to take. See Heresy.]

1. Gram. The separation or resolution of one syllable into two; -- the opposite of synaeresis.

2. A mark consisting of two dots [ ¨ ], placed over the second of two adjacent vowels, to denote that they are to be pronounced as distinct letters; as, coöperate, aërial.

© Webster 1913.
