The vowels of English vary a great deal between dialects and accents, and can be hard to describe. Consonants are comparatively easy. We can talk about the ZH sound in 'vision' and 'leisure', and it is clear what sound is meant, even though the spelling 'zh' is never actually used in English. Vowels, however, often defy neat respelling.

One possibility is to use a precise phonetic notation such as the International Phonetic Alphabet or its ASCII counterpart, SAMPA. The problem is that this ties you to one particular accent. For me, the sound [a:] is the vowel in 'bath', 'palm', and 'start', but for many other English-speakers those three words have two or even three different vowels. Conversely, the same sound [a:] is what some Americans use in 'palm', 'lot', and 'thought', which for me have three different vowels.

But however differently someone from New Zealand or Alabama or Edinburgh pronounces the vowel in 'kit', they almost always agree that it's the same vowel as in 'busy' and 'women' and 'sieve' and 'pretty'. They might or might not pronounce 'been' or 'myth' with this vowel. A lexical set is a large group of words that have the same vowel as some representative one. In this case, 'kit' is chosen, and we talk about the lexical set KIT. All of 'lip, rich, mist, hyssop' have the KIT vowel. The DRESS set includes 'dress, bet, send, friend, death, bread', and so on.

The representative words such as KIT, DRESS, TRAP, STRUT were chosen (by leading phonetician Professor John Wells) because within any one accent there is no disagreement about them; and they don't sound too close to any other word. (If we used PIT, PET, PAT, PUTT there's more chance of confusion.) So we can then classify debatable or variable words by comparing them to the standard: we can ask, do you pronounce 'been' with KIT or with FLEECE? Most people match 'myth' to KIT but a few match it to PRICE.

There are more lexical sets than are needed to describe any one accent, because there are a number of groups that behave like one vowel in one accent and another vowel in another accent. So for example there is the TRAP set (cat, mat, bad, plaid) and the PALM set (calm, bra, father, rajah), and in between them the BATH set (half, class, past, dance, plant). No-one pronounces the BATH set differently from both the other two, but the BATH words are like PALM in the south of England, and like TRAP in America and the north of England.
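One way to see why BATH has to be a set of its own is to group words by how they behave across several accents at once. The sketch below is only an illustration: the accent names and the vowel labels "a" and "ah" are placeholders I've made up, not phonetic transcriptions, and the word lists are a small sample from the sets just described.

```python
from collections import defaultdict

# Hypothetical labels: "a" = the speaker's TRAP-like vowel,
# "ah" = the speaker's PALM-like vowel. Not real transcriptions.
ACCENTS = {
    "southern_england": {
        "cat": "a", "mat": "a",
        "class": "ah", "dance": "ah",
        "calm": "ah", "father": "ah",
    },
    "america": {
        "cat": "a", "mat": "a",
        "class": "a", "dance": "a",
        "calm": "ah", "father": "ah",
    },
}

def refine(accents):
    """Group words that share a vowel in every accent given."""
    words = sorted(next(iter(accents.values())))
    groups = defaultdict(list)
    for word in words:
        # The key records the word's vowel in each accent, in a fixed order.
        key = tuple(accents[name][word] for name in sorted(accents))
        groups[key].append(word)
    return sorted(groups.values())

print(refine(ACCENTS))
```

Each accent on its own has only two vowels here, but the two accents taken together force three groups, which correspond to PALM, TRAP, and BATH.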

The BATH set is a large set. There are also of course a few individual idiosyncrasies. The word 'lever' can be pronounced with either DRESS or FLEECE, and 'leisure' likewise, but with the opposite choices. Such one-off differences don't merit their own lexical set.

Not all accents can be captured neatly like this, but most can. Sometimes a bit of extra explanation is needed. In Australia part of the BATH set (half, class, path) is pronounced like PALM, as in southern England, as are some of those before an N (aunt, can't), while others before N (dance, plant) are usually like TRAP. And the word 'castle' can be either. In parts of America the KIT and DRESS sets aren't distinct before an N, so 'pen' and 'pin' are the same. Modern London accents have turned L at the end of a syllable into a vowel, so we get lots of new diphthongs in 'fill' and 'well' and 'ball' that don't fit the lexical sets neatly. Some north English and Welsh accents have different vowels in 'rain' and 'reign'. And so on: it's not a perfect system; it just covers the main variations.

Here's the full list of lexical sets, after which I'll mention which ones are pronounced alike in some common accents. Most people will also agree with the grouping of other example words I've put after them, where I've used a variety of spellings. Inevitably there'll be some people who can't agree with all these classifications; and in some cases you might even find the key word is outside the class: e.g. an accent where 'palm' isn't in PALM.

  • KIT big, build, women, busy
  • DRESS met, friend, head
  • TRAP cat, back, hag, plaid
  • STRUT bun, Doug, come, blood
  • LOT gone, rob, wash
  • FOOT put, woman, could
  • FLEECE deed, beat, shield, machine, receive
  • FACE bake, gay, raid, reign, great
  • PRICE time, lie, buy, light
  • CHOICE boil, boy, Freud
  • GOOSE food, use, move, who, flu, beauty, tune*
  • GOAT robe, so, though
  • MOUTH bound, how, bough
  • BATH class, laugh, glance, answer
  • PALM bra, father, llama, lava, aha, bazaar
  • CLOTH soft, cross, floral
  • THOUGHT taut, caught, jaw, hawk
  • NEAR beard, fierce, weird
  • SQUARE care, chair, bear, their
  • NURSE turn, bird, earth, serve
  • START dark, large, sergeant
  • NORTH sort, hoard, war
  • FORCE port, floor
  • CURE lure, tour, boor, Europe*
  • and weak (unstressed) vowels...
  • COMMA sofa, China, about, banana
  • LETTER diner, beggar, martyr, visor
  • HAPPY silly, Tony, merry
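For anyone who wants to play with the list, it can be written down as a simple lookup table. This is just a sketch: the example words are copied from the list above, and the names LEXICAL_SETS, WORD_TO_SET, and lexical_set are my own, not part of any standard tool.

```python
# Key word -> example words, copied from the list above.
LEXICAL_SETS = {
    "KIT": ["big", "build", "women", "busy"],
    "DRESS": ["met", "friend", "head"],
    "TRAP": ["cat", "back", "hag", "plaid"],
    "STRUT": ["bun", "Doug", "come", "blood"],
    "LOT": ["gone", "rob", "wash"],
    "FOOT": ["put", "woman", "could"],
    "FLEECE": ["deed", "beat", "shield", "machine", "receive"],
    "FACE": ["bake", "gay", "raid", "reign", "great"],
    "PRICE": ["time", "lie", "buy", "light"],
    "CHOICE": ["boil", "boy", "Freud"],
    "GOOSE": ["food", "use", "move", "who", "flu", "beauty", "tune"],
    "GOAT": ["robe", "so", "though"],
    "MOUTH": ["bound", "how", "bough"],
    "BATH": ["class", "laugh", "glance", "answer"],
    "PALM": ["bra", "father", "llama", "lava", "aha", "bazaar"],
    "CLOTH": ["soft", "cross", "floral"],
    "THOUGHT": ["taut", "caught", "jaw", "hawk"],
    "NEAR": ["beard", "fierce", "weird"],
    "SQUARE": ["care", "chair", "bear", "their"],
    "NURSE": ["turn", "bird", "earth", "serve"],
    "START": ["dark", "large", "sergeant"],
    "NORTH": ["sort", "hoard", "war"],
    "FORCE": ["port", "floor"],
    "CURE": ["lure", "tour", "boor", "Europe"],
    "COMMA": ["sofa", "China", "about", "banana"],
    "LETTER": ["diner", "beggar", "martyr", "visor"],
    "HAPPY": ["silly", "Tony", "merry"],
}

# Invert the table: lower-cased word -> key word of its set.
WORD_TO_SET = {
    word.lower(): key
    for key, words in LEXICAL_SETS.items()
    for word in words
}

def lexical_set(word):
    """Return the key word for a listed example, or None if unlisted."""
    return WORD_TO_SET.get(word.lower())

print(lexical_set("women"), lexical_set("woman"))
```

Words that vary between speakers, such as 'been', are deliberately not in the table, so the lookup returns None for them rather than guessing.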

CLOTH: These words never have a vowel of their own. Some people group them with LOT, others with THOUGHT.

BATH: These words never have a vowel of their own. Some people group them with TRAP, others with PALM.

NORTH and FORCE: The same for most accents. Some Scottish and American accents distinguish them. If so, NORTH is THOUGHT + R, while FORCE is GOAT + R.

RP and other southern English: BATH = PALM = START. Also THOUGHT = NORTH = FORCE. The CLOTH set = LOT for most people, but = THOUGHT for some older upper-class and lower-class speakers. CURE is merging or has merged with THOUGHT/NORTH/FORCE. The unstressed vowels COMMA = LETTER.

General American: BATH = TRAP. Also LOT = PALM. For 40% of North Americans THOUGHT = LOT/PALM.

Scottish: TRAP = PALM = BATH. Also GOOSE = FOOT (as also in some parts of the north of England). Scottish and Irish accents typically have more than one NURSE vowel, depending on the spelling (er/ir/ur).

North of England: FOOT = STRUT.

Rhotic accents, such as General American, Scottish, Irish: those sets that have an R in the spelling (START, NORTH, NURSE, NEAR, SQUARE, FORCE, LETTER) generally aren't separate vowels, but are some other vowel + R, e.g. START is PALM + R, SQUARE is DRESS + R, NEAR is KIT + R.

Jamaican: TRAP = LOT (and therefore = BATH/PALM).
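The accent notes above amount to saying which sets fall together for which speakers, and that can be sketched as data. The accent labels and the same_vowel helper below are my own invention for illustration. The table leaves out anything the notes describe as partial or in progress (the CURE merger, the 40% of North Americans with THOUGHT = LOT/PALM), takes the majority CLOTH = LOT pattern for southern English, and does not try to model the vowel + R analysis of rhotic accents.

```python
# Accent -> groups of lexical sets that share one vowel in that accent.
# Simplified from the notes above; see the caveats in the lead-in.
MERGERS = {
    "southern_english": [
        {"BATH", "PALM", "START"},
        {"THOUGHT", "NORTH", "FORCE"},
        {"CLOTH", "LOT"},  # majority pattern only
        {"COMMA", "LETTER"},
    ],
    "general_american": [{"BATH", "TRAP"}, {"LOT", "PALM"}],
    "scottish": [{"TRAP", "PALM", "BATH"}, {"GOOSE", "FOOT"}],
    "northern_english": [{"FOOT", "STRUT"}],
    "jamaican": [{"TRAP", "LOT", "BATH", "PALM"}],
}

def same_vowel(accent, set_a, set_b):
    """True if the two lexical sets share a vowel in the given accent."""
    if set_a == set_b:
        return True
    return any(set_a in group and set_b in group for group in MERGERS[accent])

print(same_vowel("southern_english", "BATH", "PALM"))
print(same_vowel("general_american", "BATH", "PALM"))
```

With this table, BATH and PALM come out the same for southern English and different for General American, which is the contrast the BATH set exists to capture.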

The different spellings of the sets typically reflect (though imperfectly) the history of the sounds in English. So if your LOT and THOUGHT are the same, historically the LOT words had a short O vowel, while the THOUGHT ones had a long vowel, represented by the spellings AU, AW, AUGH, OUGH. If your THOUGHT and NORTH vowels are the same, the presence of the R in the spelling (and the fact that well-known R-ful accents are still around) shows you which ones came from an original OR sound.

* Note on GOOSE and CURE: English behaves as if the group "you" (U, EU) is a single vowel even though it begins with a consonant Y. So 'beauty' and 'booty' feel to English-speakers as if they have different vowels, as do 'boor' and 'pure'. So GOOSE and CURE contain both kinds, and both sets could be split in two on this basis. Phonetically, the only difference is the initial Y-consonant, and it doesn't further affect the classification of vowels.

