Array duplicates solution - Everything2.com

by jliszka

Fri Mar 09 2001 at 4:01:48

This is a solution to problem 8 on the hard interview questions node. If you have not read the question, the following will make no sense to you:

For the first part, compute the sum of the numbers from 1 to N-1. Then sum the values in the array. The duplicate number is the difference between these sums.

For the second part, compute the sum of the numbers from 1 to N+1. Then sum the values in the array. The missing number is again the difference between these sums.
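The two sum tricks above can be sketched in Python. This is only an illustration under an assumption about the question that is not spelled out here: the array has N elements, holding either 1..N-1 with one value repeated, or 1..N+1 with one value absent. The function names are made up for the example.

```python
def find_duplicate_by_sum(a):
    # Assumption: len(a) == N and a holds 1..N-1 with exactly one value repeated.
    n = len(a)
    expected = (n - 1) * n // 2          # 1 + 2 + ... + (N-1)
    return sum(a) - expected


def find_missing_by_sum(a):
    # Assumption: len(a) == N and a holds 1..N+1 with exactly one value absent.
    n = len(a)
    expected = (n + 1) * (n + 2) // 2    # 1 + 2 + ... + (N+1)
    return expected - sum(a)
```

Python integers never overflow, so this version sidesteps the fixed-width concern entirely; in a language with 32-bit ints the sums would need a wider type.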

(idea)

by bis

Fri Mar 09 2001 at 6:13:13

addition and subtraction are fine, but real men¹ use xor...

See, the problem with summing the numbers from 1 to N-1, or 1 to N+1, is that if your list is, say, 1 to 1 billion, and you are dealing with 32-bit integers, you can add, add, add, right along, until you hit the ceiling of what a 32-bit integer can store, then POW! Like a clogged toilet when you try to flush it, overflow. And overflow is bad.
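To put a number on that, here is a rough Python check of where the running sum 1 + 2 + ... + n first exceeds a signed 32-bit integer. Python itself uses arbitrary-precision integers, so nothing overflows here; the helper names and the signed 32-bit limit are assumptions chosen for illustration.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer


def triangular(n):
    # Closed form for 1 + 2 + ... + n.
    return n * (n + 1) // 2


def first_overflowing_n(limit=INT32_MAX):
    # Smallest n whose sum 1..n no longer fits under the limit.
    n = 1
    while triangular(n) <= limit:
        n += 1
    return n
```

With these definitions, `first_overflowing_n()` comes out to 65536, so a list far shorter than a billion entries is already enough to overflow a naive 32-bit sum.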

Ok, so overflow is bad. How do we avoid it? Well, you could start playing games like, "add the first element in the list, then subtract 1. Add the second element in the list, and subtract 2. Add ....", and so on. The problem is, no matter what algorithm you use, it is always possible to construct a case that results in over- (or under-) flow.

... Or, you could use the magical xor. If you can swap with it, why shouldn't you be able to find missing list elements with it? Well, you can, and it's no more complex to do than the addition/subtraction solution.

Step 1: XOR together all the numbers in the list.
Step 2: XOR that result with the XOR of all the numbers that you expect to be in the list. (1 to N-1, or 1 to N+1, depending on the problem.)
Step 3: there is no step 3; what you're left with is your answer.

How does it work?

Well, N xor N is 0, 0 xor N is N, and xor (the operation) is commutative and associative, so after you xor together the range 1 to N-1 (or N+1) and the list, every matched pair cancels to zero. What's left is a whole bunch of zeros xored with the missing (or duplicate) element, which is just that element. You're done, and you haven't overflowed.
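A minimal Python sketch of the xor steps, assuming the expected range is passed in as inclusive bounds. The names `xor_all` and `find_odd_one_out` are invented for this example; the same function covers both the duplicate and the missing case because only the range changes.

```python
from functools import reduce
from operator import xor


def xor_all(values):
    # XOR every value together, starting from 0 (the identity for xor).
    return reduce(xor, values, 0)


def find_odd_one_out(a, lo, hi):
    # Step 1: xor the list. Step 2: xor that with the expected range lo..hi.
    # Values present in both cancel; the duplicate or missing one survives.
    return xor_all(a) ^ xor_all(range(lo, hi + 1))
```

For the duplicate case with N = 4, `find_odd_one_out([1, 3, 2, 3], 1, 3)` gives 3; for the missing case with N = 3, `find_odd_one_out([1, 2, 4], 1, 4)` also gives 3.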

¹or women.. whatever.

(idea)

by cypherpunks3

Fri Apr 25 2008 at 20:02:51

That's very neat. I'm not sure why overflow would be a problem, though. Just add the numbers mod m, where m is greater than the largest number, e.g. 2^32. For the first, do sum(array) - sum(1..n-1); for the second, do sum(1..n+1) - sum(array). Since the integers modulo m under addition form an abelian group, cancellation works as expected.
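Here is a sketch of the modular version in Python. Since Python integers do not wrap on their own, the wraparound is simulated with an explicit `% m`; the modulus parameter and the function names are assumptions for the example, and m must be larger than every value in the array.

```python
def wrapped_sum(values, m=2**32):
    # Running sum that wraps around at m, like an unsigned 32-bit add.
    total = 0
    for v in values:
        total = (total + v) % m
    return total


def find_duplicate_mod(a, m=2**32):
    # sum(array) - sum(1..n-1), all taken mod m.
    n = len(a)
    return (wrapped_sum(a, m) - wrapped_sum(range(1, n), m)) % m


def find_missing_mod(a, m=2**32):
    # sum(1..n+1) - sum(array), all taken mod m.
    n = len(a)
    return (wrapped_sum(range(1, n + 2), m) - wrapped_sum(a, m)) % m
```

Passing a tiny modulus such as `m=8` forces the sums to wrap even on short lists: `find_duplicate_mod([1, 3, 2, 3], m=8)` still gives 3, although the list sum wraps from 9 down to 1 along the way.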

A less elegant, but completely different way that I thought of first would be to "attempt" a sort of count-sort of the numbers. When doing this for the first problem, at some point you're going to find the duplicate element. In Python (for 0..n-1):

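def find_extra(a):
   # values must lie in 0..len(a)-1 so they can double as indices; rearranges a in place
   for i in range(len(a)):
      while a[i] != i:
         j = a[i]
         if a[j] == j: # slot j already holds j, so the j sitting at i is the duplicate
            return j
         a[i], a[j] = a[j], a[i] # swap a[i] into its home slot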

The second problem could be sledge-hammered into a similar solution:

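def find_missing(a):
   # values are 0..len(a) with one missing; rearranges a in place
   lastp = len(a) # if every slot ends up in place, the top value is the missing one
   for i in range(len(a)):
      while a[i] != i:
         j = a[i]
         if j == len(a):
            lastp = i # the top value has no home slot; remember where it is parked
            break
         a[i], a[j] = a[j], a[i] # swap
   return lastp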
