Efficient algorithm for searching 3D coordinate in an array
I have a large array (>10^5 entries) of 3D coordinates r=(x, y, z), where x, y and z are floats. Which is the most efficient way to search a given coordinate r' in the array and give the array index. Note that the r' may not given with the same accuracy as r; say, if the array has stored coordinate (1.5, 0.5, 0.0) and r' is given as (1.49999, 0.49999, 0.0), the a开发者_JAVA技巧lgorithm should rightly pick the coordinate. I am developing the code in C.
How can one use O(1) search capability of hash table for this purpose? Converting the coordinate into string is out of question due to accuracy related issue. Is there any particular data structure that would help in O(1) algorithm?
Thanks
OnRoadCoder
check R-trees, already implemented on some RDBMS, like SQLite, and (i think) Postgres
In order to have "fuzzy" searching as you're describing (so you can support slight inaccuracies), you will have to sacrifice on O(1) algorithms.
That being said, there are some very good algorithms for this. Space partitioning (such as using an Octree or KD-Tree) is a common, popular option.
If the range of values is limited, pick the precision you want. Now, the key (1,2,3) will point to a linked list (or a fancier data structure) of all points that are within Manhattan Distance of 3 * d (d = 0.5? - depends on details) from (1,2,3). You know your application best, so you can do a better job of choosing d. Optimization approach would depend on how the data is distributed.
EDIT:
The weakness here is - if you have many points concentrated within a single cube, then there is little that can be done using a hash table about guaranteeing O(1) ... more like O(n) :)
Some sort of tree-based data structure can guaranteed O(log n).
What you are asking sounds like Nearest Neighbour Search. One approach might be to code a kd-tree (or any space partition based technique) and use that to find the nearest point to your query. But you can also go with a hash based approach, which basically does what Ipthnc's answer describes, but tries to avoid bad performance for degenerate cases.
精彩评论