In phonetics, any aspect of a sound affected by the larynx is called phonation. In the simplest case this is voice: when the vocal folds in the larynx vibrate, voice is produced. Voiceless sounds include [p t k s ch f]. The corresponding voiced sounds are [b d g z j v]. Normally all vowels are voiced, as are the nasals [m n] and the liquids [l r].

English is typical in that the contrast between voiced and voiceless sounds is very common in languages, and voiceless vowels, nasals, or liquids are rare. However, the so-called voiced and voiceless pairs of English are pronounced slightly differently from those of French. Once you get into the exact details of pronunciation, that is, phonetics rather than phonology, the situation becomes very complex. The description that follows is still much simplified.

The larynx is a box made of two interlocking curved pieces of cartilage, one at the front and top, one at the back and bottom. From front to back stretch the vocal folds (or vocal chords or cords). These can be open to allow the passage of air, or closed. They are controlled by two small objects called the arytenoid cartilages, at the back. When these spread apart, a narrow V-shaped space is formed between the vocal folds. This is called the glottis.


Voice is created when the vocal folds are closed, and the air stream from the lungs continues to press against them at high speed. They are forced open. Continued muscular tension tries to close them and is aided by the Bernoulli effect lowering the pressure behind the released air. This is one pulse of voice. The continuous cycling of opening and closing is the vibration that constitutes voice. In men the frequency is typically around 100 to 150 Hz, in women twice that, in children three or four times that.
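
As a rough worked example of the figures above, the sketch below converts a vibration frequency into the duration of one open-and-close cycle (period = 1 / frequency) and derives the ranges for women and children from the male range using the multipliers given in the text. The function and variable names are illustrative assumptions, and the ranges are only the approximate ones quoted here, not measured data.

```python
# Approximate range for men, in Hz, as quoted in the text above.
MALE_F0_HZ = (100.0, 150.0)


def cycle_period_ms(f0_hz):
    """Duration of one opening-and-closing cycle of the vocal folds, in ms."""
    if f0_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / f0_hz


def scaled_range(base, low_factor, high_factor):
    """Scale a (low, high) frequency range by the given multipliers."""
    low, high = base
    return (low * low_factor, high * high_factor)


# "Twice that" for women, "three or four times that" for children.
FEMALE_F0_HZ = scaled_range(MALE_F0_HZ, 2, 2)  # (200.0, 300.0)
CHILD_F0_HZ = scaled_range(MALE_F0_HZ, 3, 4)   # (300.0, 600.0)

for label, (low, high) in [
    ("men", MALE_F0_HZ),
    ("women", FEMALE_F0_HZ),
    ("children", CHILD_F0_HZ),
]:
    # The highest frequency gives the shortest cycle, and vice versa.
    print(f"{label}: {low:.0f}-{high:.0f} Hz, one pulse every "
          f"{cycle_period_ms(high):.1f}-{cycle_period_ms(low):.1f} ms")
```

So a voice at 100 Hz produces one pulse every 10 milliseconds, and higher voices produce correspondingly shorter pulses.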

This fundamental frequency is called F0, numbered by analogy with the formants, although it is not itself a formant. Individual sounds are characterized by higher-frequency energy created up in the mouth: the vowels are distinguished by the frequency of their formants 1, 2, and 3, with the laryngeal component F0 contributing the pitch of the sound. Fricatives like [s z sh f v th] have diffuse energetic bands of higher frequency caused by turbulent flow around the tongue or lips.
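
To make the relation between F0 and the formants a little more concrete: a periodic voice source has energy at whole-number multiples of F0 (its harmonics), and the mouth emphasises whichever harmonics lie near each formant. The sketch below lists the harmonics and picks the one nearest a given formant frequency. The numbers used are made-up illustrative figures, not taken from any measurement, and the function names are assumptions.

```python
def harmonics(f0_hz, max_hz):
    """Whole-number multiples of f0_hz up to and including max_hz."""
    if f0_hz <= 0:
        raise ValueError("F0 must be positive")
    count = int(max_hz // f0_hz)
    return [f0_hz * n for n in range(1, count + 1)]


def nearest_harmonic(f0_hz, formant_hz):
    """The harmonic of f0_hz closest to formant_hz (never below the first)."""
    if f0_hz <= 0:
        raise ValueError("F0 must be positive")
    n = max(1, round(formant_hz / f0_hz))
    return f0_hz * n


# Hypothetical values: an F0 of 120 Hz and a first formant near 500 Hz.
print(harmonics(120, 600))         # [120, 240, 360, 480, 600]
print(nearest_harmonic(120, 500))  # 480
```

Changing F0 shifts every harmonic, which is heard as a change of pitch, while the formant positions that identify the vowel stay where the mouth shape puts them.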

This is the normal form of voicing. By stretching the glottis from front to back, frequencies of more than twice the usual range can be made. This is falsetto, and is not used as an ordinary linguistic effect in any language, but may be used as a cultural marker of excitement. This is called a paralinguistic use. When the vibration is dropped to less than half its usual rate it's called creak. This is sometimes used as a component of one of the tones in tone languages. Or perhaps that is creaky voice, a combination of creak and normal voicing. I get very confused around here.
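
The rule of thumb in this paragraph can be written down as a small sketch: more than twice the usual rate counts as falsetto, less than half as creak, and anything in between as normal voice. This is only the simplification given here, under the assumption that rate alone decides the label. In reality the settings of the vocal folds differ too, and creaky voice is not separated out. The function name and the sample frequencies are hypothetical.

```python
def voice_register(f0_hz, usual_f0_hz):
    """Label a vibration rate relative to the speaker's usual rate."""
    if f0_hz <= 0 or usual_f0_hz <= 0:
        raise ValueError("frequencies must be positive")
    ratio = f0_hz / usual_f0_hz
    if ratio > 2.0:
        return "falsetto"      # more than twice the usual rate
    if ratio < 0.5:
        return "creak"         # less than half the usual rate
    return "normal voice"


# Hypothetical speaker whose usual rate is 120 Hz.
for f0 in (300, 120, 50):
    print(f0, "Hz:", voice_register(f0, 120))
```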

Murmur or breathy voice is when the vocal folds are only held laxly together. The vibration is combined with the turbulent effect of audible breath. Some tone languages have one of their tones with murmur. Any voiced sound can be murmured. In many Indian languages descended from Sanskrit there are murmured stops [b d j g] as well as plain voiced ones.

Breath and whisper

Voiceless sounds are normally made with an open glottis. However, there are two ways of having the glottis partly constricted without producing the regular vibration of voice. If the glottis is narrow enough, the air stream suffers turbulent flow. This is what we hear as [h], a glottal fricative. In words like ahead and behind, where it's surrounded by voiced vowels, the [h] is usually voiced as well as breathy, that is, murmured.

Whisper is a much more intense turbulence in the glottis, which may be either much narrowed, or reduced to a triangular opening at the back, by swivelling the arytenoid cartilages around. The smallness of the passage is why stage whispers can be made piercingly loud.

Many sounds that are usually voiced, such as vowels and nasals, do have voiceless versions under some circumstances. In some languages, such as Japanese, vowels are devoiced between (some) voiceless consonants, as in hito, kita. In French, liquids are devoiced at the end of a word after a voiceless sound, as in mettre, peuple. In Cheyenne whispered vowels are a normal possibility, not just conditioned by their environment.

Devoicing and aspiration

Sounds have a duration. They potentially have onset, nucleus, and offset portions that overlap or are affected by neighbouring sounds. In French the voiced and voiceless segments are cleanly divided, but in English the voiced sounds are partly devoiced. The voice doesn't begin at the start of [bat], and may be turned off before the end of [tab]. It is not a complete devoicing: these are not [pat, tap].

In English [pat] the voicing of the vowel doesn't begin immediately. After the [p] is a short segment of breath, that is [h]. This is called aspiration, and the narrow phonetic transcription for the sound shows this as [pʰat].

In those Indian languages with murmured stops, the murmur also continues part way into the vowel, so they're written bh dh jh gh or in phonetic notation [bʰ] etc.

In Icelandic a voiceless consonant causes (in some contexts) the voicelessness to begin on the offset of the previous vowel, giving an effect called pre-aspiration.


The two [h] sounds aren't single consonants; they're modifications of the neighbouring vowel. The syllable [ha] is [aa] with a voiceless first half. The only sound purely made in the larynx is the glottal stop [ʔ], which is an abrupt closure and release of the glottis.

The larynx as a whole can move up and down. Moving it upward causes the air trapped above to be compressed, so when the mouth closure is released the result is more explosive: this is called an ejective sound. The reverse, lowering the larynx to give a hollower sound on release, is called an implosive. These are the two types of glottalic airstream mechanism. See those nodes for more detail.

There are two pairs of terms that are much abused. One is tense and lax, the other is fortis and lenis (i.e. strong and weak). I am inclined to regard them as meaningless, or at most vague, impressionistic cover-all terms for use when you don't know the true details. There are all sorts of muscles doing all sorts of things: tightening the pharynx, propelling the tongue tip, holding the edges of the vocal folds together, distorting the shape of the larynx. Which, precisely, of these is tense in a tense sound? Which actions are done more strongly in a fortis sound?

No speech sound is simple, and any one is likely to have multiple ancillary features. Witness the aspiration of [p] and the devoicing of [b]: in English the voiceless sounds are more "tense" and the voiced are "lax". But this relationship may be different in other languages, and causes confusion if applied to Korean. Fortis consonants, such as exist in North Caucasian languages, are typically longer in duration.

John Laver, 1994, Principles of Phonetics, CUP, helped make me a little bit less confused.

Pho*na"tion (?), n. [Gr. the voice.]

The act or process by which articulate sounds are uttered; the utterance of articulate sounds; articulate speech.


© Webster 1913.

