开发者

How to find first non-repeating element?

How to find first non-repeating element in an array. Provided that you can only use 1 bit for every element of开发者_StackOverflow社区 the array and time complexity should be O(n) where n is length of array. Please make sure that I somehow imposed constraint on memory requirements. It is also possible that it can not be done with just an extra bit per element of the string. Also please let me know if it is possible or not?


I would say there is no comparison based algorithm, that can do it in O(n). As you have to compare the the first element of the array with all others, the 2nd with all except the first, the 3rd with all except the first = Sum i = O(n^2).

(But that does not necessarily mean that there is no faster algorithm, see sorting: There is a proof that you cant sort fast than O(n log n) if you are comparison based - and there is indeed one faster: Bucket Sort, which can do it in O(n)).

EDIT: In one of the other comments I said something about hash functions. I checked some facts about it, and here are the hashmap approach thoughts:

  • Obvious approach is (in Pseudocode):

    for (i = 0; i < maxsize; i++)
        count[i] = 0;
    for (i = 0; i < maxsize; i++) {
       h = hash(A[i]);
       count[h]++;
    }
    first = -1;
    for (i = 0; i < maxsize; i++)
       if (count[i] == 0) {
          first = i;
          break;
       }
    }
    for (i = 0; hash(A[i]) != first; i++) ;
    printf("first unique: " + A[i]); 
    
  • There are some caveats:

    1. How to get hash. I did some research on perfect hash functions. And indeed you can generate one in O(n). (Optimal algorithms for minimal perfect hashing by George Havas et al. - Not sure how good this paper is, as it claims as Time Limit O(n) but speaks from non linear space limit (which is plan an error, I hope I am not the only seeing the flaw in the this, but according to all theorical computer science I know off time is an upper border for space (as you dont have time to write in more space)). But I believe them when they say it is possible in O(n).

    2. The additional space - here I dont see a solution. Above papers cites some research that says that you need 2.7 bits for the perfect hash function. With the additional count array (which you can shorten to the states: Empty + 1 Element + More than 1 Element) you need 2 additional bits per element (1.58 if you assume you can it somehow combine with the above 2.7), which sums up to additional 5 bits.


Here I'm just taking one assumption that the string is Character String, just containing small alphabets, so that I can use one Integer (32 bit) so that with 26 alphabets it will be sufficient to take one bit per alphabet. Earlier I thought to take an array of 256 elements but then it will have 256*32 bits in total. 32 bits per element. But finally I found that I will be unable to do it without one more variable. So the solution is like this with just one integer (32 bits) for 26 alphabets:

 int print_non_repeating(char* str)
 {
  int bitmap = 0, bitmap_check = 0;
  int length = strlen(str);
  for(int i=0;i<len;i++)
  {
   if(bitmap & 1<<(str[i] - 'a'))
     {
        bitmap_check = bitmap_check | ( 1 << (str[i] - 'a');
      }
   else 
      bitmap = bitmap | (1 << str[i] - 'a');
  }
  bitmap = bitmap ^ bitmap_check;
  i = 0;
  if(bitmap != 0)
  {
  while(!bitmap & (1<< (str[i])))
   i++;
  cout<<*(str+i);
   return 1;
  }
  else 
  return 0;
  }


You can try doing a modified bucketsort as exemplified below. However, you need to know the max value in the array passed into the firstNonRepeat method. So this runs at O(n). For comparison based methods, the theoretical fastest (at least in terms of sorting) is O(n log n). Alternatively, you can even use modified versions of radix sort to accomplish this.

public class BucketSort{
    //maxVal is the max value in the array
    public int firstNonRepeat(int[] a, int maxVal){
        int [] bucket=new int[maxVal+1];

        for (int i=0; i<bucket.length; i++){
            bucket[i]=0;
        }

        for (int i=0; i<a.length; i++){
            if(bucket[a[i]] == 0) {
                bucket[a[i]]++;             
            } else {
                return bucket[a[i]];
            }
        }
    }
}


This code finds the first repeating element. havent figured out yet if in the same for loop if it is possible to find the non-repeating element without introducing another for (to keep the code O(n)). Other answers suggest bubble sort which is O(n^2)

#include <iostream>
using namespace std;
#define max_size 10

int main()
{
    int numbers[max_size] = { 1, 2, 3, 4, 5, 1, 3, 4 ,2, 7};
    int table[max_size] = {0,0,0,0,0,0,0,0,0,0};
    int answer = 0, j=0;

  for (int i = 0; i < max_size; i++)
  {
    j = numbers[i] %max_size;
    table[j]++;
    if(table[j] >1)
    {
          answer = 1;
          break;
    }
 }
   std::cout << "answer = " << answer ;
}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜