Find the array interval with most values present in the array
Example Sorted Position vector [4, 5, 9, 30, 31, 32, 34, 40, 47] Interval length = 6
I would like to find the maximum number of values in any given interval of length 6. In the above mentioned example, the intervals will be [Array value - Array value+Interval length : #Values present in this array]
- 4 - 10 : 3(4,5,9)
- 5 - 11 : 2(5,9)
- 9 - 15 : 1(9)
- 30 - 36 : 4(30, 31, 32, 34)
- 31 - 37 : 3
- 32 - 38 : 2
- 34 - 40 : 2
- 40 - 46 : 1
- 47 - 53 : 0
Answer should be 30-36 as it has max number of values which is 4. 开发者_Go百科Anybody has any good ideas on how to find this optimally?
I don't see how you can do this in under O(n.k) time, where n is the number of items in the array and k is the 'interval length'.
- For each element in n, you need to check the 'interval' for up to the next k elements.
I considered pre-processing the array to get a matrix of interval values, but that would take the same complexity as the basic naive algorithm.
This is the fastest way I can think of doing it. Also, a note: you defined the interval to be 6, but then said it contained items 30 through 36 (thats 7 items), so I am writing this code based on that:
int GetInterval(const vector<int> &sortedList, int intervalLength)
{
int lowest = sortedList[0];
int highest = sortedList[sortedList.size()-1];
vector<int> intervals(highest-lowest-intervalLength+1, 0);
int max = 0;
for(int i = 0; i < sortedList.size(); i++)
{
int base = sortedList[i] - lowest;
for(int j = -intervalLength; j <= intervalLength; j++)
{
int index = i + j + base;
if(index < 0 || index >= intervals.size()) continue;
if(++intervals[index] > intervals[max]) max = index;
}
}
return max + lowest;
}
So, I haven't actually ran this, but it should work. Its big-O is the length of the sorted list times the interval length. If you through out the interval length as a constant, then it is O(n). Hope this works for you. Also, I hope c++ works for you as well.
*Note it returns the lowest number in the interval.
This scans the list in reverse to eliminate some backtracking.
size_t find_interval(const std::vector& _positions, int _interval)
{
assert(_positions.size() >= 2);
size_t start_of_run = 0;
size_t end_of_run;
size_t idx;
size_t run_length = 0;
end_of_run = _positions.size() - 1;
idx = end_of_run - 1;
for(; idx >= 0; --idx)
{
if((_positions[end_of_run] - _positions[idx]) <= _interval)
{
// idx is still in the interval, see if it's the longest
// run so far
if((end_of_run - idx) > run_length)
{
start_of_run = idx;
run_length = end_of_run - idx;
}
}
else
{
// idx is out of the interval, so move end_of_run down to
// make a new valid interval
do
{
--end_of_run;
} while((_positions[end_of_run] - _positions[idx]) > _interval);
// we don't care about a run smaller than run_length, so move
// idx to the shortest interesting run
idx = end_of_run - run_length;
}
}
return start_of_run;
}
The end_of_run
variable could be updated more efficiently with a binary search between idx
and end_of_run
, but it makes the algorithm harder to understand.
精彩评论