A phonetic system of transcription that encodes the weird and wonderful symbols of the International Phonetic Alphabet (IPA) into the simple typewriter symbols (ASCII-7) available on your keyboard, so that it can be confidently transmitted anywhere on the Internet.
The original SAMPA was devised for Western European languages and gradually extended. X-SAMPA is an extended version that encodes the entire IPA and thus can be used for any language in the world, and SAMPROSA is an extension for prosodic marking. SAMPA was devised by (among others) Professor John Wells of the Phonetics Department of UCL. SAMPA and X-SAMPA will be discussed together here. (I assume a basic knowledge of the ordinary IPA. Hit the hardlinks for explanations of the phonetic terms, which would be intrusive here.)
All ordinary lower-case letters retain their IPA values. Here for reference are those that might not be obvious. The square brackets enclose a phonetic symbol:
- [a] is a low front vowel, as in Italian or Spanish; widely used in England for the sound in cat
- [c] is a voiceless palatal stop
- [e] is a mid-close vowel, as in French été
- [g] as in got
- [h] as in hot
- [i] as in machine
- [j] is a palatal approximant, as in yes, you
- [o] is a mid-close vowel, as in French haut
- [q] is the uvular stop of Arabic
- [r] is a roll, but may also be used for other rhotics such as the ordinary English R sound(s) or those of French and German
- [s] is as in hiss
- [u] is as in rule
- [w] is as in wet
- [x] is a voiceless velar fricative, as in loch, chutzpah
- [y] is as in German über, French lune
- [z] is as in zoo
The capital letters have the following values:
- [A] is a low back unrounded vowel, as in old-fashioned RP cart, some Americans' cot
- [B] is a bilabial fricative, IPA beta [β]
- [C] is a palatal fricative, IPA c-cedilla [ç], as in German ich
- [D] is a voiced dental fricative, IPA edh [ð], as in this
- [E] is a mid-open vowel, IPA epsilon [ε], as in yeah, French mère
- [F] is a labiodental nasal, IPA tailed-M
- [G] is a velar fricative, IPA gamma
- [H] is a front rounded approximant as in the u of French huit, lui, IPA overturned-h
- [I] is IPA small-capital-I as in bit
- [J] is a palatal nasal, IPA pre-tailed n, Spanish ñ
- [K] is a lateral fricative, Welsh ll
- [L] is a palatal lateral, IPA lambda, Castilian Spanish ll
- [M] is a back close unrounded vowel, IPA overturned-m, Japanese u
- [N] is a velar nasal, IPA eng [ŋ], as in sing
- [O] is a mid-open vowel, IPA turned-c, as in (British) fawn, Swedish å
- [P] is a labiodental approximant, IPA script-v, Dutch w
- [Q] is the back open rounded vowel of England English hot, IPA turned-script-a
- [R] is a uvular fricative, French r
- [S] is a postalveolar fricative, IPA esh, as in she
- [T] is a dental fricative, IPA theta [θ], as in thin
- [U] is IPA small-capital-U as in put
- [V] is a mid-open back unrounded vowel, IPA overturned-v
- [W] is a voiceless w, IPA overturned-w, English wh in those dialects where it's different from w
- [X] is a uvular fricative, IPA chi [χ]
- [Y] is IPA small-capital-Y as in German Mütter
- [Z] is a postalveolar fricative, IPA ezh, as in measure
All other typewriter symbols get used eventually. Here are some that are used in more familiar languages:
- [@] is schwa, like A in about, U in circus
- [{] is IPA [æ], the vowel in cat
- [}] is IPA barred-u [ʉ], the vowel in Scots guid, Swedish sju
- [1] is IPA barred-i [ɨ], the vowel in Russian Вы (vy)
- [2] is IPA [ø], the vowel in French deux
- [3] is the vowel in standard England English bird
- [4] is the tap, standard American English T in water, Spanish R in pero
- [5] is dark-L, IPA l-with-tilde-through, as in will in most accents
- [6] is IPA turned-a, as sometimes used for final schwa in England, or Portuguese unstressed a
- [7] is IPA ram's-horns, Estonian õ
- [8] is IPA crossed-o, rounded central vowel
- [9] is IPA [œ], the vowel in French neuf
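The capital-letter list and the list above can be collected into a single lookup table. What follows is only a minimal sketch in Python: the names `XSAMPA_TO_IPA` and `to_ipa` are my own, the IPA characters are the Unicode equivalents of the symbols described in the lists, and lower-case letters are simply passed through because they keep their IPA values. Stress marks, length and the backslash escapes are not handled here.

```python
# One-character X-SAMPA symbols from the lists above, with their IPA equivalents.
XSAMPA_TO_IPA = {
    'A': 'ɑ', 'B': 'β', 'C': 'ç', 'D': 'ð', 'E': 'ɛ', 'F': 'ɱ', 'G': 'ɣ',
    'H': 'ɥ', 'I': 'ɪ', 'J': 'ɲ', 'K': 'ɬ', 'L': 'ʎ', 'M': 'ɯ', 'N': 'ŋ',
    'O': 'ɔ', 'P': 'ʋ', 'Q': 'ɒ', 'R': 'ʁ', 'S': 'ʃ', 'T': 'θ', 'U': 'ʊ',
    'V': 'ʌ', 'W': 'ʍ', 'X': 'χ', 'Y': 'ʏ', 'Z': 'ʒ',
    '@': 'ə', '{': 'æ', '}': 'ʉ',
    '1': 'ɨ', '2': 'ø', '3': 'ɜ', '4': 'ɾ', '5': 'ɫ',
    '6': 'ɐ', '7': 'ɤ', '8': 'ɵ', '9': 'œ',
}

# str.maketrans accepts a dict whose keys are single characters.
TABLE = str.maketrans(XSAMPA_TO_IPA)

def to_ipa(text):
    """Replace each one-character symbol; anything not in the table passes through."""
    return text.translate(TABLE)

print(to_ipa('TIN'))   # thing -> θɪŋ
print(to_ipa('k{t'))   # cat   -> kæt
print(to_ipa('b3d'))   # bird  -> bɜd
```

A plain character-by-character table is enough only because every symbol in these two lists is a single keystroke; the multi-character symbols need a tokenizing step first.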
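Some symbols are escape characters. The backslash is used to make some: a letter followed by a backslash stands for a different symbol from the bare letter. With a capital letter this is often an IPA small capital, e.g. [N\] is the uvular nasal, IPA small-capital-N. Other combinations give other symbols, e.g. [p\] is the bilabial fricative in Japanese huzi (Fuji), and [J\] is the palatal stop of Hungarian Magyar. The underscore has the effect of a ligature to make even more complex symbols.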
Primary stress is indicated by " (double quote, not single stroke), and secondary by %. Nasalization is indicated by a following ~ tilde, length by a : colon.
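Because of the escape characters, a transcription has to be split into symbols before it can be interpreted. Here is a rough sketch in Python of one way to do that. The names `tokenize` and `convert_marks` are hypothetical, only the two backslash symbols mentioned above are included, and treating "underscore plus the next character" as one unit is a simplifying assumption of mine rather than the full X-SAMPA rule.

```python
import re

# Try the two-character forms first: any character + backslash, or underscore + one character.
TOKEN = re.compile(r'.\\|_.|.', re.DOTALL)

# Stress, length and nasalization marks described above (nasalization is a combining tilde).
MARKS = {'"': 'ˈ', '%': 'ˌ', ':': 'ː', '~': '\u0303'}

# Only the two backslash symbols named in the text; a real table would have many more.
ESCAPES = {'p\\': 'ɸ', 'J\\': 'ɟ'}

def tokenize(xsampa):
    """Split a transcription into symbols, keeping escape sequences together."""
    return TOKEN.findall(xsampa)

def convert_marks(xsampa):
    """Convert marks and the known escapes to IPA; every other symbol is left as it is."""
    table = {**MARKS, **ESCAPES}
    return ''.join(table.get(tok, tok) for tok in tokenize(xsampa))

print(tokenize('t_hO:'))        # ['t', '_h', 'O', ':']
print(convert_marks('"J\\a:'))  # ˈɟaː
```

Matching the longer alternatives first is what keeps [J\] from being read as [J] followed by a stray backslash.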
Examples (in my speech):
[D@ lINgwIst sO: D@ pelIk@n] = the linguist saw the pelican
[D@ nO:T wInd @n D@ san w@ dIsp"ju:tIN wItS @v D@m w@z D@ "strQNg@, wen @"lQN keIm @ "tr{vl@ r{pt In @ wO:m "@Uv@k@Ut] = The North Wind and the Sun were disputing which of them was the stronger, when along came a traveller wrapped in a warm overcoat. (N.B. This story is used by the IPA as a standard to show samples of the phonetics of different languages.)
Full system at:
http://www.phon.ucl.ac.uk/home/sampa/home.htm
http://www.phon.ucl.ac.uk/home/sampa/x-sampa.htm
On the Web we're not restricted to bare ASCII-7 but can use HTML entities, at least; so I often use a hybrid system with the more readable (if you know IPA) symbols [æ] [ð] [θ], created by writing &amp;aelig; &amp;eth; &amp;theta;, instead of [{] [D] [T].
By the way, the proper way to display IPA symbols as such on the Web is to enclose HTML numeric character references in a font that many people are likely to have and that includes IPA. This will not work fully on E2, however, because write-ups here don't support the <font> tag. The recommended font for widest coverage is Lucida Sans Unicode. So
<font face="Lucida Sans Unicode">['st&#633;&#594;&#331;g&#601;]</font>
should display a phonetic rendition of the word 'stronger'. If I use that in this write-up, however, what you see is:
['stɹɒŋgə]
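The numeric references for a string like this need not be looked up by hand. As a small sketch in Python (the function name `to_numeric_refs` is my own), each non-ASCII character can be written out as its decimal code point, and the standard library's `html.unescape` turns the result back into the original characters:

```python
import html

def to_numeric_refs(text):
    """Write every non-ASCII character as a decimal HTML numeric character reference."""
    return ''.join(ch if ord(ch) < 128 else f'&#{ord(ch)};' for ch in text)

encoded = to_numeric_refs('stɹɒŋgə')
print(encoded)                 # st&#633;&#594;&#331;g&#601;
print(html.unescape(encoded))  # stɹɒŋgə
```

The output is pure ASCII-7, so it travels as safely as SAMPA itself, though it is far less readable in the page source.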
Using IPA in Unicode: http://www.phon.ucl.ac.uk/home/wells/ipa-unicode.htm