How can I find the boundaries of a subset of a sorted list?
I have the following dilemma: I have a list of st开发者_运维问答rings, and I want to find the set of string which start with a certain prefix. The list is sorted, so the naive solution is this:
Perform binary search on the prefixes of the set, and when you find an element that starts with the prefix, traverse up linearly until you hit the top of the subset.
This runs in linear time, however, and I was wondering if anyone can suggest a more efficient way to do it.
Do a binary search for the top, and do a binary search for the bottom. Once you find the first hit, you know the top is above that point and the bottom is below (or at that point in both cases). Once you have the top and bottom you have the solution.
You can do a similar binary search for the top element, except that the string you should be looking for is the first string that starts with a prefix strictly greater than the prefix in question. This also takes O(lg n) time.
Once you find one element in the set, just keep binary searching until you find the endpoints.
精彩评论