Find closest point of every point (Nearest Neighbor)

2023-02-14 23:18 Q&A author:

I am writing a method that takes an array of points as input and finds, for each point in the array, the closest point to it other than itself. I am currently doing this by brute force (checking every point against every other point). My current implementation doesn't sort the array, but I can sort it by p.x values with the CompareByX method. I am checking the running time of the algorithm, and it gets very time consuming for large values of n. I am not very knowledgeable on this subject and know very little about different types of data structures, so any simple help would be great!

My current code is:

import java.util.*;
import java.lang.*;
import java.io.*;

class My2dPoint {
  double x;
  double y;

  public My2dPoint(double x1, double y1) {
    x=x1;
    y=y1;
  }

}


class CompareByX implements Comparator<My2dPoint> {
    public int compare(My2dPoint p1, My2dPoint p2) {
        if (p1.x < p2.x) return -1;
        if (p1.x == p2.x) return 0;
        return 1;
    }
}

    /* An object of the above comparator class is used by java.util.Arrays.sort() in main to sort an array of points by x-coordinates */

class Auxiliaries {

    public static double distSquared(My2dPoint p1, My2dPoint p2) {
        double result;
        result = (p1.x-p2.x)*(p1.x-p2.x) + (p1.y-p2.y)*(p1.y-p2.y);
        return result;
    }

}

public class HW3 {
    public static void main (String argv []) throws IOException {
        int range = 1000000; // Range of x and y coordinates in points

        System.out.println("Enter the number of points");

        InputStreamReader reader1 = new InputStreamReader(System.in);
        BufferedReader buffer1 = new BufferedReader(reader1);
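        String npoints = buffer1.readLine();
        int numpoints = Integer.parseInt(npoints.trim());

        // the search below reads inputpoints[1], so it needs at least two points
        if (numpoints < 2) {
            System.out.println("Need at least two points");
            return;
        }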

        // numpoints is now the number of points we wish to generate

        My2dPoint inputpoints [] = new My2dPoint [numpoints];

        // array to hold points

        int closest [] = new int [numpoints];

        // array to record soln; closest[i] is index of point closest to i'th

        int px, py;
        double dx, dy, dist;
        int i,j;
        double currbest;
        int closestPointIndex;
        long tStart, tEnd;

        for (i = 0; i < numpoints; i++) {

          px = (int) ( range * Math.random());
          dx = (double) px;
          py = (int) (range * Math.random());
          dy = (double) py;
          inputpoints[i] = new My2dPoint(dx, dy);

        }

        // array inputpoints has now been filled



        tStart = System.currentTimeMillis();

        // find closest [0]


        closest[0] = 1;
        currbest = Auxiliaries.distSquared(inputpoints[0],inputpoints[1]);
        for (j = 2; j < numpoints; j++) {
           dist = Auxiliaries.distSquared(inputpoints[0],inputpoints[j]);
           if (dist < currbest) {
               closest[0] = j;
               currbest = dist;
           }
        }

        // now find closest[i] for every other i 

        for (i = 1; i < numpoints; i++) {
            closest[i] = 0;
            currbest = Auxiliaries.distSquared(inputpoints[i],inputpoints[0]);
            for (j = 1; j < i; j++) {
                dist = Auxiliaries.distSquared(inputpoints[i],inputpoints[j]);
                if (dist < currbest) {
                    closest[i] = j;
                    currbest = dist;
                }
            }

            for (j = i+1; j < numpoints; j++) {
                dist = Auxiliaries.distSquared(inputpoints[i],inputpoints[j]);
                if (dist < currbest) {
                    closest[i] = j;
                    currbest = dist;
                }
            }
        }

        tEnd = System.currentTimeMillis();
        System.out.println("Time taken in Milliseconds: " + (tEnd - tStart));
    }
}

Brute force for nearest neighbour search is only feasible for a small number of points.

You might want to look into kd-Trees or spatial data structures generally.

Here is a demo of a kd-tree. This is what Wikipedia says.

I would definitely sort by x first. Then I would use the x distance between points as a quick reject test: once you have the distance to one neighbor, any closer neighbor has to be closer in x. This avoids all the distSquared computations for points outside the x range. Every time you find a closer neighbor, you also tighten up the range of x that you need to search.

Also, if P2 is the closest neighbor to P1, then I would use P1 as the initial guess for the closest neighbor to P2.

EDIT: On second thought, I'd sort by whichever dimension has the largest range.

There are some fairly standard ways of improving this kind of search, and how complicated you want to get depends on how many points you are searching.

A fairly common easy one is to sort the points by X or Y. For each point you then look for near points, going both forwards and backwards in the array. Remember how far away the nearest one you have found is, and when the difference in X (or Y) is greater than that you know there can't be any nearer point left to find.
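A minimal sketch of that sort-and-scan idea, assuming each point is stored as a {x, y} pair in a double[][] rather than as My2dPoint; the class and method names (SweepNearest, closestBySweep) are made up for illustration. It sorts an index array by x, then for each point walks left and right and stops as soon as the squared x gap alone is at least the best squared distance found so far.

```java
import java.util.Arrays;
import java.util.Comparator;

class SweepNearest {
    // Returns closest[i] = original index of the point nearest to pts[i] (other than i).
    // pts[i] = {x, y}; this layout is an assumption of the sketch.
    static int[] closestBySweep(double[][] pts) {
        int n = pts.length;
        if (n < 2) throw new IllegalArgumentException("need at least two points");

        // order[k] = original index of the k-th point when sorted by x
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingDouble(i -> pts[i][0]));

        int[] closest = new int[n];
        for (int k = 0; k < n; k++) {
            double[] p = pts[order[k]];
            double best = Double.POSITIVE_INFINITY; // best squared distance so far
            int bestIdx = -1;
            // step = -1 scans backwards in the sorted order, step = +1 scans forwards
            for (int step = -1; step <= 1; step += 2) {
                for (int m = k + step; m >= 0 && m < n; m += step) {
                    double[] q = pts[order[m]];
                    double dx = p[0] - q[0];
                    // quick reject: every further point in this direction is at least this far in x
                    if (dx * dx >= best) break;
                    double dy = p[1] - q[1];
                    double d = dx * dx + dy * dy;
                    if (d < best) { best = d; bestIdx = order[m]; }
                }
            }
            closest[order[k]] = bestIdx;
        }
        return closest;
    }
}
```

The worst case is still quadratic (for example, when all points share the same x), but for points spread evenly over a square each scan usually stops after a small fraction of the array.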

You can also partition your space using a tree. Wikipedia has a page that gives some possible algorithms. Sometimes the cost to set them up is larger than what you save. That's the sort of thing you have to decide based on how many points you are searching.

Either use a kd-tree, or use a good library for nearest neighbor search. Weka includes one.
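For a rough idea of what the kd-tree route involves, here is a small 2D sketch. It is not Weka's API or any library's; the class KdTreeSketch and its methods are hypothetical, points are assumed to be {x, y} pairs in a double[][], and the build step simply re-sorts each subrange (about O(n log² n)) to keep the code short. The tree is stored implicitly: the median of each index range is the node, and the two halves are its subtrees.

```java
import java.util.Arrays;
import java.util.Comparator;

class KdTreeSketch {
    private final double[][] pts; // pts[i] = {x, y} (assumed layout)
    private final Integer[] idx;  // point indices arranged as an implicit kd-tree
    private int bestIdx;
    private double bestDist;      // squared distance to bestIdx

    KdTreeSketch(double[][] pts) {
        this.pts = pts;
        idx = new Integer[pts.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        build(0, idx.length, 0);
    }

    // Sort idx[lo, hi) on the current axis; the median becomes the subtree root.
    private void build(int lo, int hi, int axis) {
        if (hi - lo <= 1) return;
        Arrays.sort(idx, lo, hi, Comparator.comparingDouble(i -> pts[i][axis]));
        int mid = (lo + hi) >>> 1;
        build(lo, mid, 1 - axis);
        build(mid + 1, hi, 1 - axis);
    }

    // Index of the point nearest to pts[self], excluding self; -1 if there is no other point.
    int nearestOther(int self) {
        bestIdx = -1;
        bestDist = Double.POSITIVE_INFINITY;
        search(0, idx.length, 0, self);
        return bestIdx;
    }

    private void search(int lo, int hi, int axis, int self) {
        if (lo >= hi) return;
        int mid = (lo + hi) >>> 1;
        int node = idx[mid];
        double[] q = pts[self];
        if (node != self) {
            double dx = q[0] - pts[node][0], dy = q[1] - pts[node][1];
            double d = dx * dx + dy * dy;
            if (d < bestDist) { bestDist = d; bestIdx = node; }
        }
        // Signed gap between the query and the splitting line through this node.
        double diff = q[axis] - pts[node][axis];
        if (diff < 0) {
            search(lo, mid, 1 - axis, self);          // query side first
            if (diff * diff < bestDist) search(mid + 1, hi, 1 - axis, self);
        } else {
            search(mid + 1, hi, 1 - axis, self);      // query side first
            if (diff * diff < bestDist) search(lo, mid, 1 - axis, self);
        }
    }
}
```

Usage would be to build the tree once and call nearestOther(i) for every i. The far side of a split is visited only when the splitting line is closer than the best match so far, which is where the savings over brute force come from.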

Another possibility, simpler than creating a kd-tree, is using a neighborhood matrix.

First place all your points into a 2D square matrix. Then you can run a full or partial spatial sort, so that the points become ordered inside the matrix.

Points with small Y could move to the top rows of the matrix, and likewise, points with large Y would go to the bottom rows. The same will happen with points with small X coordinates, that should move to the columns on the left. And symmetrically, points with large X value will go to the right columns.

After the spatial sort (there are many ways to achieve this, with either serial or parallel algorithms), you can look up the nearest points of a given point P by just visiting the cells adjacent to the one where P is stored in the neighborhood matrix.

You can read more details for this idea in the following paper (you will find PDF copies of it online): Supermassive Crowd Simulation on GPU based on Emergent Behavior.

The sorting step gives you interesting choices. You can use just the even-odd transposition sort described in the paper, which is very simple to implement (maybe even in CUDA). If you run just one pass of this, it will give you a partial sort, which can already be useful if your matrix is nearly sorted. That is, if your points move slowly, it will save you a lot of computation.

If you need a full sort, you can run the even-odd transposition pass several times (as described on the following Wikipedia page):

http://en.wikipedia.org/wiki/Odd%E2%80%93even_sort

If the changes are small, one or two even-odd passes will suffice to get the array sorted again.
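A rough serial sketch of the neighborhood-matrix idea follows. It is an interpretation of the description above, not the paper's code: the names (NeighborhoodMatrixSketch, buildSorted, nearestAmongAdjacent) are invented, points are assumed to be {x, y} pairs, and the number of points is assumed to be exactly side * side. Note that the lookup only inspects the up to 8 adjacent cells, so it returns an approximate nearest neighbor, not a guaranteed one.

```java
class NeighborhoodMatrixSketch {
    // grid[r][c] holds an index into pts. Repeats odd-even transposition passes
    // (rows ordered by x, columns by y) until a full round makes no swap.
    static int[][] buildSorted(double[][] pts, int side) {
        int[][] grid = new int[side][side];
        for (int i = 0; i < side * side; i++) grid[i / side][i % side] = i;
        boolean swapped = true;
        while (swapped) {
            swapped = false;
            for (int parity = 0; parity < 2; parity++) {
                // Row pass: smaller x moves towards the left columns.
                for (int r = 0; r < side; r++)
                    for (int c = parity; c + 1 < side; c += 2)
                        if (pts[grid[r][c]][0] > pts[grid[r][c + 1]][0]) {
                            int t = grid[r][c]; grid[r][c] = grid[r][c + 1]; grid[r][c + 1] = t;
                            swapped = true;
                        }
                // Column pass: smaller y moves towards the top rows.
                for (int c = 0; c < side; c++)
                    for (int r = parity; r + 1 < side; r += 2)
                        if (pts[grid[r][c]][1] > pts[grid[r + 1][c]][1]) {
                            int t = grid[r][c]; grid[r][c] = grid[r + 1][c]; grid[r + 1][c] = t;
                            swapped = true;
                        }
            }
        }
        return grid;
    }

    // Approximate nearest neighbour of the point stored at (r, c):
    // the closest point among the adjacent cells; -1 if the grid is 1x1.
    static int nearestAmongAdjacent(double[][] pts, int[][] grid, int r, int c) {
        int self = grid[r][c], best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int dr = -1; dr <= 1; dr++)
            for (int dc = -1; dc <= 1; dc++) {
                int rr = r + dr, cc = c + dc;
                if ((dr == 0 && dc == 0) || rr < 0 || cc < 0
                        || rr >= grid.length || cc >= grid.length) continue;
                int other = grid[rr][cc];
                double dx = pts[self][0] - pts[other][0], dy = pts[self][1] - pts[other][1];
                double d = dx * dx + dy * dy;
                if (d < bestDist) { bestDist = d; best = other; }
            }
        return best;
    }
}
```

For slowly moving points, the while loop could be replaced by one or two rounds per frame, which is the partial-sort variant mentioned above.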

If your points are relatively close together, you can sort by distance from some point (I think it can be any point, but it may have to be a point for which all the points are in the same quadrant if that point is treated as the origin).

Let's say the point of interest is point A, at distance D from the reference point.

Pick the closest point that is within some relatively small number n of indexes from point A in the sorted list (a larger n probably gives a better initial guess, but takes longer). If that point has linear distance g from point A, you know that the closest point has to be at most g from A. This way you only have to consider points in the list whose distance from the reference point is between D-g and D+g.

Drawing out a chart might help to understand it. If anybody cares I'll add a diagram.
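A sketch of the distance-sorted variant, under the same assumptions as before ({x, y} pairs in a double[][], invented names RadialNearest and closestByRadialSort, and a reference point (ox, oy) passed in by the caller). By the triangle inequality, the distance between two points is at least the difference of their distances to the reference point, so the D-g to D+g window holds for any choice of reference. Instead of a fixed n, this version scans outwards from A in the sorted list and shrinks g as it goes.

```java
import java.util.Arrays;
import java.util.Comparator;

class RadialNearest {
    // Returns closest[i] = index of the point nearest to pts[i] (other than i).
    static int[] closestByRadialSort(double[][] pts, double ox, double oy) {
        int n = pts.length;
        if (n < 2) throw new IllegalArgumentException("need at least two points");

        // distToRef[i] = distance from point i to the reference point (the D in the text)
        double[] distToRef = new double[n];
        for (int i = 0; i < n; i++) distToRef[i] = Math.hypot(pts[i][0] - ox, pts[i][1] - oy);

        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingDouble(i -> distToRef[i]));

        int[] closest = new int[n];
        for (int k = 0; k < n; k++) {
            int a = order[k];
            double g = Double.POSITIVE_INFINITY; // best distance found so far
            int bestIdx = -1;
            for (int step = -1; step <= 1; step += 2) {
                for (int m = k + step; m >= 0 && m < n; m += step) {
                    int b = order[m];
                    // outside the [D - g, D + g] window: nothing further this way can be closer
                    if (Math.abs(distToRef[b] - distToRef[a]) >= g) break;
                    double d = Math.hypot(pts[a][0] - pts[b][0], pts[a][1] - pts[b][1]);
                    if (d < g) { g = d; bestIdx = b; }
                }
            }
            closest[a] = bestIdx;
        }
        return closest;
    }
}
```

How well this prunes depends on the reference point: points lying on roughly the same circle around it all fall inside the window, so a reference far outside the point cloud behaves much like sorting along one direction.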
