binary search middle value calculation

2023-02-01 10:47 Q&A author:

The following is the pseudocode I got from a TopCoder tutorial about binary search:

binary_search(A, target):
   lo = 1, hi = size(A)
   while lo <= hi:
      mid = lo + (hi-lo)/2
      if A[mid] == target:
         return mid            
      else if A[mid] < target: 
         lo = mid+1
      else:
         hi = mid-1

   // target was not found

Why do we calculate the middle value as mid = lo + (hi - lo) / 2? What's wrong with (hi + lo) / 2?

I have a slight idea that it might be to prevent overflows, but I'm not sure. Perhaps someone can explain it to me, and tell me whether there are other reasons behind this.

Although this question is 5 years old, there is a great article on the Google blog that explains the problem and the solution in detail, and it is worth sharing.

It is worth mentioning that the current implementation of binary search in Java does not use mid = lo + (hi - lo) / 2. Instead, it uses a faster and clearer alternative based on the zero-fill (unsigned) right shift operator:

int mid = (low + high) >>> 1;
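To see why the unsigned shift works, here is a minimal sketch (the class and method names are made up for illustration and are not taken from the JDK). When low + high exceeds Integer.MAX_VALUE, the int sum wraps to a negative value, but its bit pattern is still the correct sum when read as an unsigned 32-bit number, so shifting in a zero bit from the left yields the right midpoint. This assumes low and high are both non-negative, as array indices are.

```java
// Hypothetical demo class; the names are illustrative only.
class UnsignedShiftMid {
    // Midpoint via zero-fill right shift; assumes 0 <= low <= high.
    static int mid(int low, int high) {
        return (low + high) >>> 1;
    }

    public static void main(String[] args) {
        int low = 1_500_000_000;
        int high = 2_000_000_000;
        System.out.println(low + high);        // wrapped, negative int sum
        System.out.println((low + high) / 2);  // still negative: the overflow bug
        System.out.println(mid(low, high));    // 1750000000
    }
}
```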

Yes, (hi + lo) / 2 may overflow. This was an actual bug in the Java binary search implementation.

No, there are no other reasons for this.

From later on in the same tutorial:

"You may also wonder as to why mid is calculated using mid = lo + (hi-lo)/2 instead of the usual mid = (lo+hi)/2. This is to avoid another potential rounding bug: in the first case, we want the division to always round down, towards the lower bound. But division truncates, so when lo+hi would be negative, it would start rounding towards the higher bound. Coding the calculation this way ensures that the number divided is always positive and hence always rounds as we want it to. Although the bug doesn't surface when the search space consists only of positive integers or real numbers, I've decided to code it this way throughout the article for consistency."
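The rounding difference the tutorial describes can be checked with a small sketch. The class and method names below are hypothetical, and the bounds are chosen only to make lo + hi negative. Java integer division truncates toward zero, so the two formulas disagree on such inputs.

```java
// Hypothetical demo class; the names are illustrative only.
class RoundingMid {
    // Truncates toward zero, so it rounds toward hi when lo + hi < 0.
    static int naiveMid(int lo, int hi) {
        return (lo + hi) / 2;
    }

    // hi - lo is non-negative when lo <= hi, so this always rounds toward lo.
    static int safeMid(int lo, int hi) {
        return lo + (hi - lo) / 2;
    }

    public static void main(String[] args) {
        // The exact midpoint of -3 and 0 is -1.5.
        System.out.println(naiveMid(-3, 0)); // -1 (rounded toward hi)
        System.out.println(safeMid(-3, 0));  // -2 (rounded toward lo)
    }
}
```

For non-negative bounds both methods return the same value, which is why the difference never shows up when searching array indices.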

It is indeed possible for (hi+lo) to overflow the integer type. In the improved version, it may seem that subtracting lo from hi and then adding it back is pointless, but there is a reason: for non-negative lo <= hi, the difference hi-lo cannot overflow, and it has the same parity as hi+lo (they differ by 2*lo), so (hi-lo)/2 leaves the same remainder as (hi+lo)/2. lo can then be safely added after the division to reach the same result.
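The equivalence can be spot-checked against a reference computed in 64-bit arithmetic, where the sum cannot overflow. This is only a sketch: the class and method names are hypothetical, and the brute-force range is an arbitrary small sample rather than a proof.

```java
// Hypothetical demo class; the names are illustrative only.
class MidEquivalence {
    // The overflow-safe formula from the tutorial.
    static int safeMid(int lo, int hi) {
        return lo + (hi - lo) / 2;
    }

    // Reference value: the sum is done in long, so it cannot overflow.
    static int exactMid(int lo, int hi) {
        return (int) (((long) lo + hi) / 2);
    }

    // Compares both formulas for every pair from <= lo <= hi <= to.
    static boolean agreeOnRange(int from, int to) {
        for (int lo = from; lo <= to; lo++) {
            for (int hi = lo; hi <= to; hi++) {
                if (safeMid(lo, hi) != exactMid(lo, hi)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOnRange(0, 200)); // true
        int lo = Integer.MAX_VALUE - 1;
        int hi = Integer.MAX_VALUE;
        System.out.println(safeMid(lo, hi) == exactMid(lo, hi)); // true
    }
}
```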

Let us assume that the array we're searching is of length INT_MAX. Hence, initially:

high = INT_MAX 
low = 0

In the first iteration, we notice that the target element is greater than the middle element, so we move the start index just past mid:

low = mid + 1

In the next iteration, mid is calculated as (high + low) / 2, which is (INT_MAX + low) / 2, where low is now roughly half of INT_MAX, plus 1.

The first part of this operation, (high + low), overflows because the sum exceeds the maximum int value, INT_MAX.
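The two iterations described above can be traced in Java, where Integer.MAX_VALUE plays the role of INT_MAX. This is a sketch with hypothetical class and method names; it only reproduces the index arithmetic and does not search a real array.

```java
// Hypothetical demo class; the names are illustrative only.
class OverflowWalkthrough {
    // Returns {low after the first iteration, naive second mid, safe second mid}.
    static int[] secondIteration() {
        int low = 0;
        int high = Integer.MAX_VALUE;

        // First iteration: the target is greater than the middle element.
        int mid = low + (high - low) / 2;
        low = mid + 1;

        int naive = (high + low) / 2;       // the sum wraps around to a negative int
        int safe = low + (high - low) / 2;  // stays within range
        return new int[] {low, naive, safe};
    }

    public static void main(String[] args) {
        int[] r = secondIteration();
        System.out.println("low   = " + r[0]); // 1073741824
        System.out.println("naive = " + r[1]); // -536870912
        System.out.println("safe  = " + r[2]); // 1610612735
    }
}
```

A negative mid used as an array index would throw an ArrayIndexOutOfBoundsException in Java.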

Go has no unsigned right shift operator (>>>). To avoid integer overflow while calculating the middle value in Go, we can convert the sum to an unsigned type before shifting:

mid := int(uint(lo+hi) >> 1)

The "why" has been answered, but it is not easy to see why the solution works. So let's assume high is 10 and low is 5, and that 10 is the largest value an integer can hold (10+1 would overflow).

We cannot compute (10+5)/2 = 7 directly, because 10 plus anything positive overflows.

Instead we compute 5 + (10-5)/2 = 5 + 2 = 7 (integer division truncates 2.5 to 2), and no intermediate value exceeds 10.
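The same toy scenario can be reproduced with a real small integer type. The sketch below assumes Java's 8-bit byte (maximum 127) as a stand-in for the imaginary type whose maximum is 10. The class and method names are hypothetical, and the explicit casts emulate arithmetic confined to that type, since Java would otherwise promote the operands to int.

```java
// Hypothetical demo class; the names are illustrative only.
class ByteMid {
    // The sum is forced back into a byte, so it wraps past 127.
    static byte naiveMid(byte lo, byte hi) {
        byte sum = (byte) (lo + hi);
        return (byte) (sum / 2);
    }

    // The difference always fits in a byte when 0 <= lo <= hi.
    static byte safeMid(byte lo, byte hi) {
        byte diff = (byte) (hi - lo);
        return (byte) (lo + diff / 2);
    }

    public static void main(String[] args) {
        byte lo = 64;
        byte hi = 127;
        System.out.println(naiveMid(lo, hi)); // -32
        System.out.println(safeMid(lo, hi));  // 95
    }
}
```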
