Find next lower item in a sorted list
let's say I have a sorted list of Floats. Now I'd like to get the index of the next lower item of a given value. The usual for-loop aprroach has a complexity of O(n). Since the list is sorted there must be a way to get the index with O(log n).
M开发者_JAVA技巧y O(n) approach:
index=0
for i,value in enumerate(mylist):
if value>compareValue:
index=i-1
Is there a datatype for solving that problem in O(log n)?
How about bisect?
>>> import bisect
>>> float_list = [1.0, 1.3, 2.3, 4.5]
>>> i = bisect.bisect_left(float_list, 2.5)
>>> index = i - 1
>>> index
2
You might have to handle the case of a search value less than or equal to the lowest / leftmost value in the list separately (index == -1
in this case).
Depending on which index you want to have in case of equality, you might have to use bisect_right
instead.
You can do a binary search on an array/list to get the index of the object you're looking for and get the index below it to get the lower entry (given that there actually is a lower entry!).
See: Binary search (bisection) in Python
Be careful when comparing floating point numbers for equality!
Use the bisect module. The function
bisect.bisect_left(mylist, compareValue)
returns the proper insertion point for item in list to maintain sorted order.
import bisect
def next_lower_value(values_list, input_value):
index= bisect.bisect_left(values_list, input_value)
if index == 0: # there's not a "next lower value"
raise NotImplementedError # you must decide what to do here
else:
return values_list[index - 1]
>>> l= [11, 15, 23, 28, 45, 63, 94]
>>> next_lower_value(l, 64)
63
>>> next_lower_value(l, 63)
45
>>> next_lower_value(l, 1000)
94
>>> next_lower_value(l, 1)
Traceback (most recent call last):
File "<pyshell#29>", line 1, in <module>
next_lower_value(l, 1)
File "<pyshell#26>", line 4, in next_lower_value
raise NotImplementedError # you must decide what to do here
NotImplementedError
Since you request the index and not the next lower value, change the function next_lower_value
to return index - 1
instead of values_list[index - 1]
.
To answer part of the question about datatypes: In a general sense, the datatype most appropriate for finding things in O(log n) time (while maintaining O(1) performance on inserts and deletes!) is the binary tree. You can find things in it by making a series of left-right decisions, which is very analogous to how you do a binary search in a linear list but is (IMO) a little more conceptually intuitive.
That said, from what little I know of Python, binary trees don't seem to be in the language's standard library. For your application, there would probably be no benefit to include an implementation just for this purpose.
Finally, both binary trees and binary search in a sorted list will allow you to shorten the search by one step: It isn't necessary to search for the key item and then move back to its predecessor. Instead, on every comparison step, if you encounter the key value, act as if it was too large. This will cause your search to end up on the next smaller value. Done carefully, this may also help with the "almost equal floating point value" problem mentioned by bart.
If I'm reading this right, the next lower item is the first item in the list that's less than or equal to x. The bisect documentation for searching sorted lists gives this function:
def find_le(a, x):
'Find rightmost value less than or equal to x'
i = bisect_right(a, x)
if i:
return a[i-1]
raise ValueError
def lower_bound(arr, x):
left = 0
right = len(arr)-1
mid = -1
if(arr[left] > x):
return mid
while(left <= right):
mid = int(left + (right - left + 1) / 2)
if(left == right and right == mid):
return mid
if(x > arr[mid]):
left = mid
elif(x < arr[mid]):
right = mid - 1
else:
return mid
return mid
This function returns the index of the element in the sorted list 'arr' if the exact element is found else it returns the index of the largest element smaller than the given number 'x'. If no element is smaller than the given number it returns -1.
精彩评论