Find any one of multiple possible repeated integers in a list

2023-03-14 12:58 问答作者：

Given an array of n+1 integers, each in the range 1 to n, find an integer that is repeated.

I was asked this at a job interview. Here's my answer: The Pigeonhole Principle says there has to be a repeat. I tried to use a binary search approach, so I did this in Matlab, because that's what I know:

t开发者_如何学JAVAop = 0;
bot = 0;
for i=1:n+1
  if P[i] > n/2 
    top = top+1;
  else
    bot = bot+1;
end

So then I argue that one of these, top or bot, has to be bigger than n/2 by the PhP again. Take that range and repeat.

I thought this was a pretty good solution, but the interviewer sort of hinted that one can do better. Please post any better solutions you know of.

I'm not sure how you're defining "better", but perhaps this qualifies. At least it's different from your solution and the solutions to the linked list questions (pun intended).

If we make a path

n+1 --> array[n+1] --> array[array[n+1]] --> ...

then this path contains a cycle if and only if array^k[n+1] = array^l[n+1] for some k != l, that is, if and only if there is a repeat. The question now becomes a common linked list problem that can be solved as follows.

Start two particles on the first node. Let the first particle move at unit speed and let the second particle move at twice unit speed. Then if there is a cycle, the second particle will end up looping back behind the first, and eventually they'll be the same. Why? Well, if you think of the particles as on a circle (which they will be once the find the loop), at every time unit the second particle gets one directed step closer to the first. Therefore they must eventually collide. One they do, you've found a loop. To find the repeated value, simply get the length of the loop by letting one particle stand still while the other runs the loop again. Then start both particles at the start again, let one move the length of the loop ahead, and then run both particles together with constant distance between them until they meet again at the beginning of the loop.

As some commentators are outraged that I did not include all of the details of how to find a loop in a linked list, here it now is. No promises that this isn't buggy (it's Matlab-esque pseudocode after all), but it should at least explain the idea.

%% STEP 1: find a point in the cycle of the linked list using a slow and fast particle
slow = n+1;
fast = n+1;
for i=1 to n+1
    slow = array[slow];
    fast = array[array[fast]];
    if (slow == fast)
        break;
end

%% STEP 2: find the length of the cycle by holding one particle fixed
length = 1;
fast = array[fast]
while fast != slow
    fast = array[fast];
    length = length+1;
end

%% STEP 3: find the repeated element by maintaining constant distance between particles
slow = n+1;
fast = n+1;
for i=1 to length
    fast = array[fast];
end
while fast != slow
    fast = array[fast];
    slow = array[slow];
end

%% STEP 4: return the repeated entry
return slow;

Here I started at n+1 because array[i] is between 1 and n, so neither particle will ever be sent back to n+1. This makes at most one pass through the array (though not in order) and keeps track of two particles (slow and fast) and one integer (length). The space is therefore O(1) and the time is O(n).

If you know that there is exactly one number that is duplicate you can find it by summing all of them and subtracting the sum of numbers from 1 to n:

duplicate = sum P[i] - n(n+1)/2

If not, then you can iterate through the array and put each number in a hashtable. If the number already exists then that's the duplicate. This is also O(n) assuming the hashtable operations are O(1).

Or event better - to avoid the hashtable you can use an array of booleans of size n:

int[] P = new int[] { 3, 2, 5, 1, 4, 2 };
bool[] Q = new bool[6];

foreach( var p in P ){
    if ( Q[p] ) {
        Console.WriteLine("Duplicate: " + p);
        break;
    }
    Q[p] = true;
}

How about this simple solution:

start creating a binary search tree from the array. Whenever there is an element already present which is a duplicate while you are inserting into the BST then store that element in another array of duplicate elements and continue your loop .we don't even need to sort the array for finding the duplicates here.

This is just my idea.I was asked the same question in one of the interviews and this was my answer.

This works in a similar way as @PengOne's answer but it is simpler I believe.

Explanation:

This approach treats the array as a graph where value at index i points to index a[i]-1 (so value 1 points to index 0). There is at least 1 repeating number, so the graph will be cyclic. There are n+1 elements and the max is n, so the last node a[n+1] will never be a part of a cycle but will enter a cycle. This is important as this last node is the start node for traversal. Note that if a node which is part of a cycle is used as start node with slow (1x) and fast (2x) pointers then they meet at that same node itself which is not useful. Let's call the converging node as meet node. If the meet node is k hops away from the cycle node, the start node will also be k hops away from the cycle node. This logic is same as finding the cycle node in a cyclic linked list. The array is traversed a max of 3 times so O(n) time and O(1) space.

Find any one of multiple possible repeated integers in a list

Algo:

Starting from last node (a[n+1]), find the meet node using slow (1x) and fast (2x) pointers.
Advance two pointers (1x) from meet node and from start node and they will converge at the cycle node (The repeating numbers point to cycle node).

Code:

//pseudocode
//O(n) time, O(1) space
findrepeating(a):
    x = findrepeating(a, findmeet(a), a[a.length() -1])
    return x

findmeet(a):
    slow = fast = a[a.length() -1]
    while true:
        slow = a[slow-1]
        fast = a[a[fast-1]-1]
        if slow == fast:
            break
    meet = slow // or fast
    return meet

findrepeating(a, meet, start):
    m = meet
    s = start
    while m != s:
        m = a[m-1]
        s = a[s-1]
    return m // or s

We use circle detection's idea to solve this problem.

All we need to do is first find the begin of the circle and then find the duplicated one in the circle.

Here's the code in c++:

int findDuplicate(vector<int>& nums) {
    int slow = nums[0];
    int fast = nums[nums[0]];

    while(slow != fast){
        slow = nums[slow];
        fast = nums[nums[fast]];
    }
    fast = 0;
    while(slow != fast){
        slow = nums[slow];
        fast = nums[fast];
    }

    return slow;
}

for(int i=n+1;i!=a[i];i=a[i]);

cout<<i;

继续阅读：algorithm duplicates search

Find any one of multiple possible repeated integers in a list

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？