I once gave the b &= b-1 solution in an interview, but my interviewer wanted a faster solution (which is not nearly as slick):

First, do some preprocessing. Create a lookup table (an array) that maps a number between 0 and 255 (the index) to the number of bits in the binary representation of that number (you can use b &= b-1). Then, for each byte in the unsigned int, use the byte as an index into the table, and sum the results. This runs in constant time, which is especially good if you have lots of bytes to count.