开发者

Algorithm question. I don't understand sentence quoted. Please help me out

Question from Programming Pearls, 2nd edition:

Given a sequential file containing 4,300,000,000 32-bit integers, how can you find one that appears at least twice?

Solution provided in the book:

Binary search find an element that occurs at least twice by recursively searching the subinterval that contains more than half of the integers. My original solution did not guarantee that the number of integers is halved in each iteration, so the worst case run time of its log2 n passes was proportional to n·log n. Jim Saxe reduced that to linear time by observing that the search can avoid carrying too many duplicates.

When his search knows that a duplicate must be in a current range of m integers, it will only store m+1 integers on its current work tape;

If more integers would have gone on the tape, his program discards them. Although his method frequently ignores input variables, its strategy is conservative e开发者_如何学运维nough to ensure that it finds at least one duplicate.


Above is content from the book. I don't understand the sentences quoted. How exactly can it be implemented? I mean, how can he know that "a duplicate must be in the current range of m integers"?

Thanks for your help!


2 ^ 32 = 4,294,967,296. You have a file with 4,300,000,000 integers, guaranteeing duplicates.

  1. Find the central number, it should 2^31 = 2147483648. If it is less, the duplicates are most probably in the second half. If not, duplicates have occurred in the first half.

  2. Find the central number again, it should 2^30 = 1073741824...

Repeat until you find the duplicate.


I think it refers to the pigeonhole principle, that if the difference between the minimum and the maximum element in a set is less than the cardinality of the set, there must be a duplicate.

And you can check this as you're building your subsets and stop building them as soon as you're certain a duplicate has to exist in that subset.


Wow. I think that book might be a bit old. This is a basic binary search problem.

And I think the book is kind of awkward. Maybe try wikipedia

http://en.wikipedia.org/wiki/Binary_search_algorithm

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜