pl/sql code for nearest neighbor query without index for a table with point data in oracle

2023-02-22 10:41 问答作者：

i am trying to build a procedure to obtain k nearest neighbor points to a point with a selected ID. I need to do this without using any spatial locator features like sdo_geometry or nn.

basically i have a table in oracle with ID, Data_X, Data_Y. let's say i have 10 entries in my table, and i need the 3 nearest points to a fictional point target_x, target_y.

we would need to calculate the euclidean distance of each point in the table with my given fictional point. I just dont know an algorithm开发者_运维知识库 in pl/sql that would return me the nearest neighbor ids.

Calculate the distance (Pythagoras) between each point and the selected point and order by the distance. Pseudo sql:

select id from points
order by sqrt(sqr(Data_x - target_x) + sqr(Data_y - target_y))

The first 3 rows are the the nearest 3 points.

nang's answer is a great starting point, and if it does the job, I'd use it. Unfortunately it will most probably require a full table scan (or perhaps a full index scan, if you have a covering index).

If performance becomes a problem, you could probably have a look at making a poor-man's spatial index over the data. It won't be as simple as "create index" but it just might work.

The proper method would be to create a custom index, but that would just be reinventing the sdo_geometry wheel, a path you said you wish to avoid.

A simple-but-rough method (disclaimer: this is just an idea off the top of my head, not tested) might be to create a function-based index that groups all the points in 2D space into square blocks. You basically create an index to map each (x,y) pair onto a list of blocks. Each block would have a defined width and height, and to do a search you would first work out which grid of blocks need to be searched, then query across just the list of points in that grid.

An example index would be something like:

CREATE INDEX grid_block_i ON points (TRUNC(Data_X/100), TRUNC(Data_Y/100), id);

What value you substitute for 100 will depend on the range of values your points take. You will want to divide the plane into a large number of grid blocks, so that the index is reasonably selective; but not so large that a typical query will have to search too many blocks to find candidates.

You could use the index above by using a query like this:

select id
from (select id, Data_X, Data_Y
      from points
      where TRUNC(Data_X/100) BETWEEN TRUNC(:target_x/100)) - :threshold
                                  AND TRUNC(:target_x/100)) + :threshold
      and   TRUNC(Data_Y/100) BETWEEN TRUNC(:target_y/100)) - :threshold
                                  AND TRUNC(:target_y/100)) + :threshold
     )
order by sqrt(sqr(Data_x - :target_x) + sqr(Data_y - :target_y))

You can then set :threshold to basically eliminate a large set of blocks of points from the query. I reckon if the values for the functional index (i.e. 100) and the threshold are set correctly, you'll see the query use the function-based index to get a small set of candidates, instead of calculating the distance for every single point in the table.

The downside is that if :threshold is too low, the query might return no rows. On the other hand, that might be a useful feature, depending on your needs.

继续阅读：algorithm nearest-neighbor oracle plsql

pl/sql code for nearest neighbor query without index for a table with point data in oracle

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？