k-nearest neighbour classifier but using a distribution?
I am building a classifier for some 2D data.
I have some training data for which I know the classes and have plotted these on a graph to see the clustering.
To the observer there are obvious, separate clusters, but unfortunately they are spread out along lines rather than in tight clumps. One line-spread goes up at about an 80-degree angle, another at 45 degrees, and another at about 10 degrees from horizontal, but all three seem to point back to the origin.
I want to perform a nearest-neighbour classification on some test data. From the looks of things, if the test data is very similar to the training data, a 3-nearest-neighbour classifier should work fine, except near the origin of the graph, where the three clusters come close together and a few errors seem likely.
Should I be estimating a Gaussian distribution for each cluster? If so, I'm not sure how to combine that with a nearest-neighbour classifier.
I'd be grateful for any input.
Cheers
Transform all your points to [r, angle], and scale r down to the range 0 to 90 too, before running nearest-neighbor.
Why? k-NN uses the Euclidean distance between points and centres (in most implementations), but you want

distance( point, centre ) = sqrt( (point.r - centre.r)^2 + (point.angle - centre.angle)^2 )

rather than

sqrt( (point.x - centre.x)^2 + (point.y - centre.y)^2 ).
Scaling r down to a smaller range, say 0 to 30 or even 0 to 10, would weight angle more than r, which seems to be what you want.
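A minimal sketch of this transform, assuming NumPy and scikit-learn are available; the synthetic data here is just a stand-in for the three line-shaped clusters described in the question:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def to_scaled_polar(X, r_scale):
    """Map 2D Cartesian points (n, 2) to [scaled r, angle in degrees]."""
    r = np.hypot(X[:, 0], X[:, 1])
    angle = np.degrees(np.arctan2(X[:, 1], X[:, 0]))
    return np.column_stack([r * r_scale, angle])

# Synthetic stand-in for the questioner's data: three line-shaped
# clusters through the origin at roughly 80, 45, and 10 degrees.
rng = np.random.default_rng(0)
r = rng.uniform(1.0, 50.0, 300)
theta = np.radians(np.repeat([80, 45, 10], 100)) + rng.normal(0, 0.03, 300)
X_train = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
y_train = np.repeat([0, 1, 2], 100)

# Scale r so its range matches the 0-90 degree angle range; shrink the
# numerator (e.g. 30.0 or 10.0) to weight angle even more heavily.
# Reuse the training-set scale for any test data.
r_scale = 90.0 / np.hypot(X_train[:, 0], X_train[:, 1]).max()

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(to_scaled_polar(X_train, r_scale), y_train)

X_test = np.array([[5.0, 5.0], [40.0, 7.0]])  # example query points
print(knn.predict(to_scaled_polar(X_test, r_scale)))
```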
Why use k-NN for that purpose? Any linear classifier would do the trick; try solving it with an SVM and you'll get much better results. If you insist on using k-NN, you clearly have to scale the features and transform them into polar coordinates, as described in the other answer.
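A rough sketch of the SVM route, again assuming scikit-learn and reusing the hypothetical X_train / y_train / X_test arrays from the example above:

```python
from sklearn.svm import SVC

# On raw (x, y) coordinates a linear kernel can carve the plane into
# wedges separating line-shaped clusters that radiate from the origin;
# an RBF kernel (SVC's default) is a safe fallback if they turn out
# not to be linearly separable.
svm = SVC(kernel="linear")
svm.fit(X_train, y_train)
print(svm.predict(X_test))
```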