How to handle null values during binary search?
What's the best way to go about handling null
s during a binary search over a List<string>
(well, it would be a List<string>
if I could read all the values out beforehand)?
int previous = 0;
int direction = -1;
if (itemToCompare == null) {
previous = mid;
for (int tries = 0; tries < 2; tries++) {
开发者_如何学Go mid += direction;
itemToCompare = GetItem(mid);
while (itemToCompare == null && insideInclusiveRange(min, max, mid)) {
mid += direction;
itemToCompare = GetItem(mid);
}
if (!insideInclusiveRange(min, max, mid)) {
/* Reached an endpoint without finding anything,
try the other direction. */
mid = previous;
direction = -direction;
} else if (itemToCompare != null) {
break;
}
}
}
I'm currently doing something like the above - if null is encountered, then linearly search in a direction until either non-null or beyond endpoint is encountered, if no success then repeat in other direction. In the actual code I'm getting direction
from the previous comparison result, and GetItem()
caches the values it retrieves. Is there an easier way, without making an intermediate list of non-null values (takes far too long for my purposes because the GetItem()
function above is slow)?
I guess I'm asking if there's a smarter way to handle null values than to degrade to a linear search. In all likelihood there will only be a small percentage of nulls (1-5%), but it's possible for there to be sequences of 100s of null.
Edit - The data looks something like this
aa aaa b bb bbb c cc d ddd
where each row is a separate object, and not all cells are guaranteed to be filled. The user needs to be able to search across an entire row (so that both "bb" and "bbb" would match the entire second row). Querying each object is slow enough that a linear search will not work. For the same reason, creating a new list without nulls is not really feasible.
Unless there is a reason to actually select/find a null
value (not sure what that means as null
is a singleton and binary search is often most desirable on unique values), consider not allowing them in the list at all.
[Previous answer: After reflecting on the question more I have decided that null
s likely have no place in the problem-space -- take bits and parts as appropriate.]
If nulls are desired, just sort the list such that null values are first (or last) and update the logic correctly -- then just make sure not to invoke a method upon any of the null
values ;-)
This should have little overall impact since a sort is already required. If items are changed to null
-- which sounds like an icky side-effect! -- then just "compact" the List (e.g. "remove" the null item). I would, however, just not modify the sorted list unless there is a good reason.
Binary search is only really designed/suitable for (entirely) sorted data. No point turning it into a binary-maybe-linear search.
Happy coding.
精彩评论