search for before and after values in a long sorted list
What would be the fastest way to search for a number (eg. 12.31) in long sorted list and get 开发者_Python百科the values one before and after my "search" value when the exact value isn't found (eg. 11.12 and 12.03 in the list below)?
Many thanks in advance.long_list = [10.11, 11.12, 13.03, 14.2 .. 12345.67]
The fastest is probably to use built-in support in python. Here I'm thinking about the bisect module. Below I'm using a dictionary to quickly check in O(1) if a value is in the list; if not, bisect
is used to find values smaller than and larger than the sought value.
#!/usr/bin/env python
import bisect
def find_lt(a, x):
'Find rightmost value less than x'
i = bisect.bisect_left(a, x)
if i:
return a[i-1]
raise ValueError
def find_gt(a, x):
'Find leftmost value greater than x'
i = bisect.bisect_right(a, x)
if i != len(a):
return a[i]
raise ValueError
# First create a test-list (49996 items)
i=1.0
R=[1.0]
D={}
while i < 10000:
i+=0.2
i=round(i,2)
D[i]=True
R.append(i)
# Locate a value, in this case 100.3 which is not in the list
x=100.3
if D.has_key(x):
print "found", x
else:
print find_lt(R, x)
print find_gt(R, x)
Output for x=100.3
:
100.2
100.4
Exponential search (AKA galloping search) would perform better than plain binary search if the list is very long. The idea is to scan forward from position 0 on increasing steps until the answer is passed at this point a binary search can be performed to the range formed by the last two steps. If the element is not found then the last attempt will point to the closest elements.
Have a look at Basic Techniques for information retrieval. The pseudo-code algorithm is provided and they discuss its complexity against binary search.
If your list is sorted as in your example, I suppose a binary search would be fastest.
li = [10.11, 11.12, 13.03, 14.2, 15.6, 15.8, 17.9, 12345.67]
def searsh(x,li):
itli = iter(li)
a = itli.next()
if a==x:
return a
else:
while True:
b = itli.next()
if b==x:
return b
elif a<x<b:
return (a,b)
a = itli.next()
if a==x:
return a
elif b<x<a:
return (b,a)
print searsh(13.5,li)
print searsh(10.11,li)
print searsh(50.3,li)
print searsh(12345.67,li)
result
(13.03, 14.2)
10.11
(17.9, 12345.67)
12345.67
Also:
def searsh(x,li):
a = li[0]
if a==x:
return a
else:
j = 0
while True:
j += 1
b = li[j]
if b==x:
return b
elif a<x<b:
return (a,b)
j += 1
a = li[j]
if a==x:
return a
elif b<x<a:
return (b,a)
精彩评论