开发者

How to use KDTree to make top-k query and range query on arbitrary dimensions

I have used KD-tree(libkdtree++) to store开发者_开发技巧 a multi-dimensional data set, and the requirements here is this data set can support top-k/range queries on different dimensions. For example, a KDTree<3, Point> tree: to find the top 100 points whose have highest Point[1](y axis) values.

From the implementation of libkdtree++, what's similar is the "find_within_range" functions, however it is counted based on "Manhattan distance", which equals max(x_dist, max(y_dist, z_dist)) here. How can I just use range query on one dimension?


Looking at the code, it looks like you can't do that in a straightforward way, ridiculously enough. If I were you I'd be tempted to either hack the library or write my own kd-tree. I'd ask on their mailing list to be sure, but it looks like you might have to do something like this:

kdtreetype::_Region_ r(point_with_min_y);
r.set_low_bound(min_x, 0);
r.set_high_bound(max_x, 0);
r.set_low_bound(min_z, 2);
r.set_high_bound(max_z, 2);
r.set_high_bound((min_y + max_y) / 2, 1);

double search_min = min_y, search_max = max_y;

// binary search to get 100 points
int c;
while (c = tree.count_within_range(r) != 100) {
    if (c > 100) search_max = (search_min + search_max) / 2;
    else         search_min = (search_min + search_max) / 2;
    r.set_high_bound((search_min + search_max) / 2);
}

tree.visit_within_range(r, process_min_y_point);

This is a horribly inefficient binary search for the Y at which count(points with y <= Y) == 100. I'm not familiar with the library, but that's the best I've got on a cursory inspection.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜