开发者

Find the array interval with most values present in the array

Example Sorted Position vector [4, 5, 9, 30, 31, 32, 34, 40, 47] Interval length = 6

I would like to find the maximum number of values in any given interval of length 6. In the above mentioned example, the intervals will be [Array value - Array value+Interval length : #Values present in this array]

  • 4 - 10 : 3(4,5,9)
  • 5 - 11 : 2(5,9)
  • 9 - 15 : 1(9)
  • 30 - 36 : 4(30, 31, 32, 34)
  • 31 - 37 : 3
  • 32 - 38 : 2
  • 34 - 40 : 2
  • 40 - 46 : 1
  • 47 - 53 : 0

Answer should be 30-36 as it has max number of values which is 4. 开发者_Go百科Anybody has any good ideas on how to find this optimally?


I don't see how you can do this in under O(n.k) time, where n is the number of items in the array and k is the 'interval length'.

  • For each element in n, you need to check the 'interval' for up to the next k elements.

I considered pre-processing the array to get a matrix of interval values, but that would take the same complexity as the basic naive algorithm.


This is the fastest way I can think of doing it. Also, a note: you defined the interval to be 6, but then said it contained items 30 through 36 (thats 7 items), so I am writing this code based on that:

int GetInterval(const vector<int> &sortedList, int intervalLength)
{
  int lowest = sortedList[0];
  int highest = sortedList[sortedList.size()-1];
  vector<int> intervals(highest-lowest-intervalLength+1, 0);
  int max = 0;

  for(int i = 0; i < sortedList.size(); i++)
  {
    int base = sortedList[i] - lowest;
    for(int j = -intervalLength; j <= intervalLength; j++)
    {
      int index = i + j + base;
      if(index < 0 || index >= intervals.size()) continue;
      if(++intervals[index] > intervals[max]) max = index;
    }
  }

  return max + lowest;
}

So, I haven't actually ran this, but it should work. Its big-O is the length of the sorted list times the interval length. If you through out the interval length as a constant, then it is O(n). Hope this works for you. Also, I hope c++ works for you as well.

*Note it returns the lowest number in the interval.


This scans the list in reverse to eliminate some backtracking.

size_t find_interval(const std::vector& _positions, int _interval)
{
    assert(_positions.size() >= 2);

    size_t start_of_run = 0;
    size_t end_of_run;
    size_t idx;
    size_t run_length = 0;

    end_of_run = _positions.size() - 1;
    idx = end_of_run - 1;
    for(; idx >= 0; --idx)
    {
        if((_positions[end_of_run] - _positions[idx]) <= _interval)
        {
            // idx is still in the interval, see if it's the longest
            // run so far
            if((end_of_run - idx) > run_length)
            {
                start_of_run = idx;
                run_length = end_of_run - idx;
            }
        }
        else
        {
            // idx is out of the interval, so move end_of_run down to
            // make a new valid interval
            do
            {
                --end_of_run;
            } while((_positions[end_of_run] - _positions[idx]) > _interval);

            // we don't care about a run smaller than run_length, so move
            // idx to the shortest interesting run
            idx = end_of_run - run_length;
        }
    }

    return start_of_run;
}

The end_of_run variable could be updated more efficiently with a binary search between idx and end_of_run, but it makes the algorithm harder to understand.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜