Soundex is woefully inadequate. It is meant to be phonetic, but it codes each letter in isolation instead of looking at the spelling around it. A few simple rules would improve it.

The numbers are assigned by going through the alphabet: B = 1, C = 2, D = 3, and so on, except that a letter that sounds like one already numbered joins that letter's group, so F = 1 because F is labial like B. The next consonant that doesn't sound much like any of the previous ones is L, so L = 4, then M = 5, and finally comes the contentious claim that R counts as a consonant and is different from the rest. So you get this list of groups, each with its phonetic basis:

1: B, F, P, V -- all labial
2a: C, G, K, Q, X -- all velar
2b: C, S, X, Z -- all sibilant
2c: C, G, J -- all palatal
3: D, T -- both alveolar
4: L -- lateral
5: M, N -- both nasal
6: R -- rhotic in some positions or in some dialects

Because C can sound like K and Q, as in cat, those letters all get lumped together; but because C can also sound like S, as in city, the sibilants get lumped in as well, with the result that the phonetically nonsensical K = S gets built in. Then CH as in church is close to G as in gent and to J as in judge, yet these are nothing like the K set or the S set. Treating C as a letter in isolation is a fudge that lumps too many sounds together.
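
To see the K = S collision concretely, here is a minimal Python sketch of Soundex using the standard table, in which the velar, sibilant and palatal sets all share the digit 2. The names `soundex` and `CODES` are my own, and the handling of H and W follows the usual description of American Soundex; treat it as an illustration, not a reference implementation.

```python
# Standard Soundex table: velars, sibilants and palatals all collapse into "2".
CODES = {}
for digit, letters in {
    "1": "BFPV",
    "2": "CGJKQSXZ",
    "3": "DT",
    "4": "L",
    "5": "MN",
    "6": "R",
}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a four-character Soundex code (first letter plus three digits)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")  # vowels, H, W and Y have no code
        if code and code != prev:
            out += code
        if ch not in "HW":  # H and W do not separate a repeated code
            prev = code
    return (out + "000")[:4]  # pad with zeros, then truncate


print(soundex("Mack"), soundex("Mass"))    # M200 M200
print(soundex("Baker"), soundex("Baser"))  # B260 B260
```

Mack and Mass come out the same only because K and S sit in the same group, which is exactly the lumping complained about above.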

Fix: look at the next letter. In CE, CI and CY, treat C = S; in CH, treat C = J; anywhere else, treat C = K. These rules get the great majority of contexts right.
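
The next-letter rule for C could be sketched in Python roughly as follows. The function names `classify_c` and `resolve_c` are hypothetical, and returning a stand-in letter (S, J or K) rather than a digit is my own simplification, since the numbers the three split groups would get are not settled here.

```python
def classify_c(word, i):
    """Return the stand-in letter ('S', 'J' or 'K') for the C at word[i]."""
    nxt = word[i + 1].upper() if i + 1 < len(word) else ""
    if nxt in ("E", "I", "Y"):
        return "S"  # CE, CI, CY: sibilant, as in "city"
    if nxt == "H":
        return "J"  # CH: palatal, as in "church"
    return "K"      # anything else, including a final C: velar, as in "cat"


def resolve_c(word):
    """Return word in upper case with every C replaced by its stand-in letter."""
    upper = word.upper()
    return "".join(
        classify_c(upper, i) if ch == "C" else ch
        for i, ch in enumerate(upper)
    )


print(resolve_c("city"))    # SITY
print(resolve_c("cat"))     # KAT
print(resolve_c("church"))  # JHURJH
```

The output would then be fed to a table in which S, J and K belong to different groups.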

On the other hand, R gets treated as a consonant even where, in many accents, it is just part of a vowel, so names that sound identical in those accents, like Houghton and Horton, don't get matched. As genealogy is likely to be conducted on names used in England, this is a significant consideration. Fix: have a dialect switch. If it is set, look at the letter after each R: if it's a vowel, treat R as a consonant; otherwise ignore it.
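
A rough Python sketch of the dialect switch, run as a preprocessing step before coding. The function name `drop_silent_r` and the flag `non_rhotic` are hypothetical, and counting Y as a vowel is my own assumption.

```python
# Assumption: Y counts as a vowel here, so the second R in "Harry" is kept.
VOWELS = frozenset("AEIOUY")


def drop_silent_r(name, non_rhotic=True):
    """Return name in upper case, dropping every R not followed by a vowel.

    With non_rhotic=False the switch is off and every R is kept.
    """
    word = name.upper()
    if not non_rhotic:
        return word
    kept = []
    for i, ch in enumerate(word):
        if ch == "R":
            nxt = word[i + 1] if i + 1 < len(word) else ""
            if nxt not in VOWELS:
                continue  # R before a consonant or at the end: ignore it
        kept.append(ch)
    return "".join(kept)


print(drop_silent_r("Horton"))                    # HOTON
print(drop_silent_r("Harris"))                    # HARIS
print(drop_silent_r("Horton", non_rhotic=False))  # HORTON
```

This only deals with R. Under the standard table, Houghton would still differ from Horton after this step, because the G in Houghton still receives a code.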

The omission of W from the table shows that Soundex was designed for English-language use. In most languages W = V would be more appropriate.