开发者

Finding an element in an array where every element is repeated odd number of times (but more than single occurrence) and only one appears once

You have an array in which every number is repeated odd number of times (but more than single occurrence). Exactly开发者_如何学Python one number appears once. How do you find the number that appears only once?

e.g.: {1, 6, 3, 1, 1, 6, 6, 9, 3, 3, 3, 3}

The answer is 9.

I was thinking about having a hash table and then just counting the element whose count is 1. This seems trivial and i am not using the fact that every other element is repeated an odd no of times. Is there a better approach.


I believe you can still use the basic idea of XOR to solve this problem in a clever fashion.

First, let's change the problem so that one number appears once and all other numbers appear three times.


Algorithm:

Here A is the array of length n:

   int ones = 0;  
   int twos = 0;  
   int not_threes, x;  

   for (int i=0; i<n; ++i) {  
            x =  A[i];  
            twos |= ones & x;  
            ones ^= x;  
            not_threes = ~(ones & twos);  
            ones &= not_threes;  
            twos &= not_threes;  
   }  

And the element that occurs precisely once is stored in ones. This uses O(n) time and O(1) space.

I believe I can extend this idea to the general case of the problem, but possibly one of you can do it faster, so I'll leave this for now and edit it when and if I can generalize the solution.


Explanation:

If the problem were this: "one element appears once, all others an even number of times", then the solution would be to XOR the elements. The reason is that x^x = 0, so all the paired elements would vanish leaving only the lonely element. If we tried the same tactic here, we would be left with the XOR of distinct elements, which is not what we want.

Instead, the algorithm above does the following:

  • ones is the XOR of all elements that have appeared exactly once so far
  • twos is the XOR of all elements that have appeared exactly twice so far

Each time we take x to be the next element in the array, there are three cases:

  1. if this is the first time x has appeared, it is XORed into ones
  2. if this is the second time x has appeared, it is taken out of ones (by XORing it again) and XORed into twos
  3. if this is the third time x has appeared, it is taken out of ones and twos.

Therefore, in the end, ones will be the XOR of just one element, the lonely element that is not repeated. There are 5 lines of code that we need to look at to see why this works: the five after x = A[i].

If this is the first time x has appeared, then ones&x=ones so twos remains unchanged. The line ones ^= x; XORs x with ones as claimed. Therefore x is in exactly one of ones and twos, so nothing happens in the last three lines to either ones or twos.

If this is the second time x has appeared, then ones already has x (by the explanation above), so now twos gets it with the line twos |= ones & x;. Also, since ones has x, the line ones ^= x; removes x from ones (because x^x=0). Once again, the last three lines do nothing since exactly one of ones and twos now has x.

If this is the third time x has appeared, then ones does not have x but twos does. So the first line let's twos keep x and the second adds x to ones. Now, since both ones and twos have x, the last three lines remove x from both.


Generalization:

If some numbers appear 5 times, then this algorithm still works. This is because the 4th time x appears, it is in neither ones nor twos. The first two lines then add x to ones and not twos and the last three lines do nothing. The 5th time x appears, it is in ones but not twos. The first line adds it to twos, the second removed it from ones, and the last three lines do nothing.

The problem is that the 6th time x appears, it is taken from ones and twos, so it gets added back to ones on the 7th pass. I'm trying to think of a clever way to prevent this, but so far I'm coming up empty.


For the problem as stated it is most likely that the most efficient answer is the O(n) space answer. On the other hand, if we narrow the problem to be "All numbers appear n times except for one which only appears once" or even "All numbers appear a multiple of n times except for one which only appears once" then there's a fairly straightforward solution for any n (greater than 1, obviously) which takes only O(1) space, which is to break each number into bits and then count how many times each bit is turned on and take that modulo n. If the result is 1, then it should be turned on in the answer. If it is 0, then it should be turned off. (Any other answer shows that the parameters of the problem did not hold). If we examine the situation where n is 2, we can see that using XOR does exactly this (bitwise addition modulo 2). We're just generalizing things to do bitwise addition modulo n for other values of n.

This, by the way, is what the other answer for n=3 does, it's actually just a complex way of doing bit-wise addition where it stores a 2-bit count for each bit. The int called "ones" contains the ones bit of the count and the int called "twos" contains the twos bit of the count. The int not_threes is used to set both bits back to zero when the count reaches 3, thus counting the bits modulo 3 rather than normally (which would be modulo 4 since the bits would wrap around). The easiest way to understand his code is as a 2-bit accumulator with an extra part to make it work modulo 3.

So, for the case of all numbers appearing a multiple of 3 times except the one unique number, we can write the following code for 32 bit integers:

int findUnique(int A[], int size) {
  // First we set up a bit vector and initialize it to 0.
  int count[32];
  for (int j=0;j<32;j++) {
    count[j] = 0;
  }

  // Then we go through each number in the list.
  for (int i=0;i<size;i++) {
    int x = A[i];

    // And for each number we go through its bits one by one.
    for (int j=0;j<32;j++) {
      // We add the bit to the total.
      count[j] += x & 1;
      // And then we take it modulo 3.
      count[j] %= 3;
      x >>= 1;
    }
  }

  // Then we just have to reassemble the answer by putting together any
  // bits which didn't appear a multiple of 3 times.
  int answer = 0;
  for (int j=31;j>=0;j--) {
    answer <<= 1;
    if (count[j] == 1) {
      answer |= 1;
    }
  }

  return answer;
}

This code is slightly longer than the other answer (and superficially looks more complex due to the additional loops, but they're each constant time), but is hopefully easier to understand. Obviously, we could decrease the memory space by packing the bits more densely since we never use more than two of them for any number in count. But I haven't bothered to do that since it has no effect on the asymptotic complexity.

If we wish to change the parameters of the problem so that instead the numbers are repeated 5 times, we just change the 3s to 5s. Or we can do likewise for 7, 11, 137, 727, or any other number (including even numbers). But instead of using the actual number, we can use any factor of it, so for 9, we could just leave it as 3, and for even numbers we can just use 2 (and hence just use xor).

However, there is no general bit-counting based solution for the original problem where a number can be repeated any odd number of times. This is because even if we count the bits exactly without using modulo, when we look at a particular bit, we simply can't know whether the 9 times it appears represents 3 + 3 + 3 or 1 + 3 + 5. If it was turned on in three different numbers which each appeared three times, then it should be turned off in our answer. If it was turned on in a number which appeared once, a number which appeared three times, and a number which appeared five times, then it should be turned on in our answer. But with just the count of the bits, it's impossible for us to know this.

This is why the other answer doesn't generalize and the clever idea to handle the special cases is not going to materialize: any scheme based on looking at things bit by bit to figure out which bits should be turned on or off does not generalize. Given this, I don't think that any scheme which takes space O(1) works for the general case. It is possible that there are clever schemes which use O(lg n) space or so forth, but I would doubt it. I think that the O(n) space approach is probably the best which can be done in the problem as proposed. I can't prove it, but at this point, it's what my gut tells me and I hope that I've at least convinced you that small tweaks to the "even number" technique are not going to cut it.


I know that the subtext of this question is to find an efficient or performant solution, but I think that the simplest, readable code counts for a lot and in most cases it is more than sufficient.

So how about this:

var value = (new [] { 1, 6, 3, 1, 1, 6, 6, 9, 3, 3, 3, 3, })
    .ToLookup(x => x)
    .Where(xs => xs.Count() == 1)
    .First()
    .Key;

Good old LINQ. :-)


Test Score 100% with c#

using System;
using System.Collections.Generic;
// you can also use other imports, for example:
// using System.Collections.Generic;

// you can write to stdout for debugging purposes, e.g.
// Console.WriteLine("this is a debug message");

class Solution {
    public int solution(int[] A) {

        Dictionary<int, int> dic =new Dictionary<int, int>();

        foreach(int i in A)
        {
            if (dic.ContainsKey(i))
            {
                dic[i]=dic[i]+1;                
            }
            else
            {
                dic.Add(i, 1);                
            }
        }

        foreach(var d in dic)
        {
            if (d.Value%2==1)
            {
                 return d.Key;
            }
        }

        return -1;
    }
}


Java, Correctness 100%, Performance 100%, Task score 100%

// you can also use imports, for example:
// import java.util.*;

// you can write to stdout for debugging purposes, e.g.
// System.out.println("this is a debug message");
import java.util.HashMap;

class Solution {

     /*Simple solution 
          Will be using HashMap(for performance) as Array, 
          only Key set is needed.
          Initially tried with ArryList but there was performance issue with
          that so switch to HashMap.
          Iterate over the given array and if item is there in key set         
          remove it(coz you found your pair) otherwise add as new Key.
          After full Iteration only one key will be there in the Map that is 
          your solution.
          In Short: If pair is found, remove it from Map's Key set,
          Last Key will be your solution
    */
    public int solution(int[] A) {

        //Map but used as Array
        final HashMap<Integer, Boolean> mapHave = new HashMap<>();

        //Iterate over given Array
        for (int nIdx = 0; nIdx < A.length; nIdx++) {

            //Current Item
            Integer nVal = A[nIdx];

            //Try to remove the current item, if item does not exists will
            //return null and if condition will be true
            if (mapHave.remove(nVal) == null) {
                //current item not found, add it
                mapHave.put(nVal, Boolean.TRUE);
            }
        }

        //There will be only one key remaining in the Map, that is
        //your solution
        return mapHave.keySet().iterator().next();
    }
}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜