All of this code is well and good but, as cadmos so eloquently hinted at, "Premature optimization is the root of all evil." (usually attributed to Donald Knuth, sometimes to C.A.R. Hoare). Your clever trick could cost you when the code hits the core.

The cost might not be only an efficiency penalty, either: If you XOR two signed integers in C, the result is, strictly speaking, not guaranteed to be well defined. On a machine that does not use two's complement, it could create a trap representation that causes the machine to dump core, for instance. The simpler method using a temporary variable is guaranteed to work, no matter what machine your code is heading for next.
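For comparison, here is a minimal sketch of the two approaches side by side. The function names `swap_tmp` and `swap_xor` are made up for illustration and do not come from the code earlier in the thread. The XOR version is deliberately written for `unsigned` operands, where bitwise results are always well defined, and it guards against both pointers naming the same object, since XORing a value with itself would zero it.

```c
/* Swap through a temporary: works for any int values, positive or
   negative, regardless of how the machine represents signed integers. */
void swap_tmp(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* XOR swap (hypothetical illustration), limited to unsigned operands so
   that every intermediate bit pattern is a valid value. */
void swap_xor(unsigned *a, unsigned *b)
{
    if (a == b)
        return;   /* same object: x ^ x is 0, so the trick would erase it */
    *a ^= *b;     /* *a now holds (a ^ b) */
    *b ^= *a;     /* *b now holds the original a */
    *a ^= *b;     /* *a now holds the original b */
}
```

Calling `swap_tmp(&x, &y)` with `x = -3` and `y = 7` leaves `x == 7` and `y == -3`, with no caveats attached. The XOR version needs a type restriction and an aliasing check just to be safe, which is rather the point.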

And that's the point: The simpler, obvious method is simpler and obvious. It is more human-friendly, which would be a very good reason to prefer it even if the alternative were more machine-friendly. As Knuth maintains, code is meant to be read by humans first, and machines second. C is a notational convenience we layer over machine code, and as such it damned well should be convenient.