
Creating a cluster centroid that is not sensitive to noise

I'm working on a clustering algorithm that groups similar ranges of real numbers. After grouping them, I have to create one range for each cluster, i.e., a cluster centroid. For example, if one cluster contains the ranges <1,6>, <0,7> and <0,6>, then the cluster should cover all values in <0,7>. The question is how to create such a resulting range. I was thinking of taking the min and max over all ranges in the cluster, but that would make the algorithm very sensitive to noise. I should probably weight it somehow, but I'm not sure how. Any hints? Thanks.


Perhaps you can convert all ranges to their midpoints before running your clustering algorithm. That turns your problem into clustering points on a line. It also avoids a problem with clustering the ranges directly: the centroid range can 'grow' and, in the next iteration, absorb ranges that perhaps should belong to another cluster.

# pseudocode: map each range to its midpoint
midpoints = {}
for range in ranges
    midpoints[range] = range.min + (range.max - range.min) / 2.0
end

After the algorithm finishes, you can do as you originally suggested: take the min and max over all the ranges in each cluster to create the range for that cluster's centroid.
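The two steps above can be sketched in Python roughly as follows. This is only an illustration under assumptions: ranges are represented as `(low, high)` tuples, the helper names `midpoint` and `envelope_per_cluster` are made up for this example, and the `labels` list is a hypothetical stand-in for whatever 1-D clustering you run on the midpoints.

```python
from collections import defaultdict


def midpoint(r):
    """Midpoint of a (low, high) range."""
    low, high = r
    return low + (high - low) / 2


def envelope_per_cluster(ranges, labels):
    """Return {label: (min_low, max_high)} over the ranges assigned to each label."""
    grouped = defaultdict(list)
    for r, label in zip(ranges, labels):
        grouped[label].append(r)
    return {
        label: (min(lo for lo, _ in members), max(hi for _, hi in members))
        for label, members in grouped.items()
    }


ranges = [(1, 6), (0, 7), (0, 6), (20, 25), (21, 24)]

# Step 1: reduce every range to a single point on the line.
midpoints = [midpoint(r) for r in ranges]

# Step 2 (hypothetical): a fixed threshold stands in for a real
# 1-D clustering of `midpoints`; replace it with your own algorithm.
labels = [0 if m < 10 else 1 for m in midpoints]

# Step 3: build each cluster's range from the min and max of its members.
centroids = envelope_per_cluster(ranges, labels)
print(centroids)
```

Because the cluster assignment depends only on the midpoints, the resulting range cannot grow during clustering. The final min/max step is still affected by any outlier range that ends up inside a cluster.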
