Which is the fastest way to search a value within a set of many "range" objects in Python

2023-04-09 17:27 问答作者：

I have a list of many Python objects like this:

class RangeClass(object):

    def __init__(self,address,size):
        self.address=address
      开发者_如何学编程  self.size=size
        #other attributes and methods...

Then, I have a list (rangelist) of RangeClass objects.

I need to find within which range a given value is.

I can use some code like this:

for r in ragelist:
    if(value>=r.address and value<(r.address+r.size)):
        return r
return None

But I think there is a faster way. Ranges have arbitrary size, but we can assume that they don't overlap.

Thank you.

If you have many values to test, then you could use the bisect module to find which range the values are in more quickly.

m = the number of values to test, and
n = len(rangelist)

then looping through the values and rangelist as you suggest would take O(m*n) time.

If you use bisection, then you must first sort the starting addresses O(nlogn) and find each value's place in rangelist O(m*logn). So if

O(nlogn + m*logn) < O(m*n)

then bisection wins. For large n, O(m*logn) is miniscule compared to O(m*n). So the inequality above would be true if

O(nlogn) < O(m*n)

or equivalently, when

C log(n) < m

for some constant C.

Thus, when n is large and C log(n) < m, you might try something like

import bisect

class RangeClass(object):

    def __init__(self,address,size):
        self.address=address
        self.size=size
    def __str__(self):
        return '[{0},{1})'.format(self.address,self.address+self.size)
    def __lt__(self,other):
        return self.address<other.address

rangelist=sorted([RangeClass(i,1) for i in (1,3,4,5,7.5)])
starts=[r.address for r in rangelist]

def find_range(value):
    start_idx=bisect.bisect(starts,value)-1
    try:
        r=rangelist[start_idx]
    except IndexError:
        return None
    start=r.address
    end=r.address+r.size
    if start<=value<end:
        return rangelist[start_idx]
    return None    

print(','.join(str(r) for r in rangelist))

for value in (0,1,1.5,2,3,4,5,6,7,8,9,10):
    r=find_range(value)
    if r:
        print('{v} in {r}'.format(v=value,r=str(r)))
    else:
        print('{v} not in any range'.format(v=value))

Not really. All you can do is take advantage of Python's relational operator chaining.

if r.address <= value < (r.address + r.size):

You could also define __contains__ on RangeClass to allow you to use in to find it instead.

class RangeClass(object):
   ...
  def __contains__(self, val):
    return self.address <= val < (self.address + self.size)

 ...
  if value in r:

Implement the comparison operator in Range, have the range list sorted, and use bisect to search for the range a value belongs in:

import bisect
def find_range(value):
    index = bisect.bisect(rangelist, value)
    if index not in (0, len(rangelist)):
        index -= 1
    return rangelist[index]

Thank you to everyone,

I am actually using the method proposed by unutbu.

Moreover, I add another optimization:

if(value <= first_range_value or value >= last_range_value):
    return None

Where first_range_value and last_range_value have been computed before, and they are the smallest r.address value and the largest r.address+r.size value .

This is worthing in my application, but it really depends on the distribution of ranges and values.

继续阅读：list optimization performance python search

Which is the fastest way to search a value within a set of many "range" objects in Python

更多精彩内容

精彩评论

最新问答

优酷会员投屏还得另收费!监管部门最新回应,具体是如何回应的?？

乐果f5音响怎样恢复出厂设置？

小霸王Q900好Q1000哪个好用？

任天堂switch忘记密码如何办?？

“优酷回应投屏收费”上热搜,视频网站如何稳健发展?？

问答排行榜

Escaping "<" in Perl-generated XML

微信重新建群怎么建？

imessage会显示已读吗？

太快了能不能慢一点好爽~好大~不要拔出来了？

二年级家长回音怎么写大全简短的（二年级家长回音怎么写）？