开发者

How to handle null values during binary search?

What's the best way to go about handling nulls during a binary search over a List<string> (well, it would be a List<string> if I could read all the values out beforehand)?

int previous = 0;
int direction = -1;
if (itemToCompare == null) {
    previous = mid;

    for (int tries = 0; tries < 2; tries++) {
   开发者_如何学Go     mid += direction;
        itemToCompare = GetItem(mid);
        while (itemToCompare == null && insideInclusiveRange(min, max, mid)) {
            mid += direction;
            itemToCompare = GetItem(mid);
        }
        if (!insideInclusiveRange(min, max, mid)) {
            /* Reached an endpoint without finding anything,
                try the other direction. */
            mid = previous;
            direction = -direction;
        } else if (itemToCompare != null) {
            break;
        }
    }
}

I'm currently doing something like the above - if null is encountered, then linearly search in a direction until either non-null or beyond endpoint is encountered, if no success then repeat in other direction. In the actual code I'm getting direction from the previous comparison result, and GetItem() caches the values it retrieves. Is there an easier way, without making an intermediate list of non-null values (takes far too long for my purposes because the GetItem() function above is slow)?

I guess I'm asking if there's a smarter way to handle null values than to degrade to a linear search. In all likelihood there will only be a small percentage of nulls (1-5%), but it's possible for there to be sequences of 100s of null.

Edit - The data looks something like this

         aa         aaa
b        bb         bbb
c        cc
d                   ddd

where each row is a separate object, and not all cells are guaranteed to be filled. The user needs to be able to search across an entire row (so that both "bb" and "bbb" would match the entire second row). Querying each object is slow enough that a linear search will not work. For the same reason, creating a new list without nulls is not really feasible.


Unless there is a reason to actually select/find a null value (not sure what that means as null is a singleton and binary search is often most desirable on unique values), consider not allowing them in the list at all.


[Previous answer: After reflecting on the question more I have decided that nulls likely have no place in the problem-space -- take bits and parts as appropriate.]

If nulls are desired, just sort the list such that null values are first (or last) and update the logic correctly -- then just make sure not to invoke a method upon any of the null values ;-)

This should have little overall impact since a sort is already required. If items are changed to null -- which sounds like an icky side-effect! -- then just "compact" the List (e.g. "remove" the null item). I would, however, just not modify the sorted list unless there is a good reason.

Binary search is only really designed/suitable for (entirely) sorted data. No point turning it into a binary-maybe-linear search.

Happy coding.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜