Getting the index of the closest data point to each centroid in k-means clustering in MATLAB

I am doing some clustering using k-means in MATLAB. As you may know, the usage is:

[IDX,C] = kmeans(X,k)

where IDX gives the cluster number for each data point in X, and C gives the centroid of each cluster. I need the index (the row number in the actual data set X) of the data point closest to each centroid. Does anyone know how I can do that? Thanks


The "brute-force approach", as mentioned by @Dima, would go as follows:

%# preallocate one output index per cluster
closestIdx = zeros(1,max(IDX));
%# loop through all clusters
for iCluster = 1:max(IDX)
    %# find the points that are part of the current cluster
    currentPointIdx = find(IDX==iCluster);
    %# find the index (among points in the cluster)
    %# of the point that has the smallest Euclidean distance from the centroid
    %# bsxfun subtracts coordinates, then you sum the squares of
    %# the distance vectors, then you take the minimum
    [~,minIdx] = min(sum(bsxfun(@minus,X(currentPointIdx,:),C(iCluster,:)).^2,2));
    %# store the index into X (among all the points)
    closestIdx(iCluster) = currentPointIdx(minIdx);
end

To get the coordinates of the point that is closest to the center of cluster k, use

X(closestIdx(k),:)
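
As a quick sanity check, the loop above can be run on a tiny data set that is small enough to verify by hand. The values of X and IDX below are made up for illustration, and C is computed as the per-cluster mean rather than taken from kmeans, so the sketch needs no toolbox:

```matlab
% Hypothetical toy data: two well-separated groups of three points each
X   = [0 0; 1 0; 4 0; 10 10; 11 10; 12 10];
IDX = [1; 1; 1; 2; 2; 2];           % assumed cluster labels
k   = max(IDX);

% Centroids as per-cluster means (stand-in for the C returned by kmeans)
C = zeros(k, size(X,2));
for i = 1:k
    C(i,:) = mean(X(IDX == i, :), 1);
end

% Same brute-force search as above
closestIdx = zeros(1, k);
for iCluster = 1:k
    currentPointIdx = find(IDX == iCluster);
    diffs = bsxfun(@minus, X(currentPointIdx,:), C(iCluster,:));
    [~, minIdx] = min(sum(diffs.^2, 2));
    closestIdx(iCluster) = currentPointIdx(minIdx);
end

disp(closestIdx)   % rows 2 and 5 of X
```

Cluster 1 has its centroid at (5/3, 0), so row 2 (the point (1, 0)) is nearest; cluster 2 has its centroid at (11, 10), which coincides with row 5.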


The brute-force approach would be to run k-means, then compare each data point in a cluster to that cluster's centroid and pick the one closest to it. This is easy to do in MATLAB.

On the other hand, you may want to try the k-medoids clustering algorithm, which gives you an actual data point as the "center" of each cluster. Here is a MATLAB implementation.
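
For the k-medoids route, a minimal sketch is shown here. It assumes the kmedoids function from the Statistics and Machine Learning Toolbox is available (it is not the implementation linked in the original answer), and the sample data is made up. Its fifth output holds the row indices of the medoids in X, which is exactly the kind of index the question asks for:

```matlab
% Assumes the Statistics and Machine Learning Toolbox (kmedoids) is installed
rng(1);                                  % fix the seed so runs are repeatable
X = [randn(30,2); randn(30,2) + 6];      % hypothetical sample data, two blobs
k = 2;

% idx  : cluster label for each row of X
% C    : medoid coordinates, one row per cluster
% midx : row index in X of each medoid, so C equals X(midx,:)
[idx, C, ~, ~, midx] = kmedoids(X, k);

disp(midx.')
```

Note that a medoid is not generally the same point as the data point nearest a k-means centroid, since the two algorithms optimize different objectives.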


Actually, kmeans already gives you the answer, if I understand you right:

[IDX, C, ~, D] = kmeans(X, k); % D(n,i) is the distance from data point n to centroid i (squared Euclidean by default)
[minD, indMinD] = min(D, [], 1); % indMinD(i) is the index (in X) of the closest point to the i-th centroid
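
One caveat: taking the column-wise minimum of D searches all data points, not only those assigned to cluster i, so in principle the winner could belong to a different cluster. If the closest point must be a member of its own cluster, the entries of D for non-members can be masked out first. The following is a sketch under that assumption; the sample data and the names Dmasked and closestIdx are illustrative only:

```matlab
% Assumes the Statistics and Machine Learning Toolbox (kmeans) is installed
rng(0);                                  % fix the seed so runs are repeatable
X = [randn(20,2); randn(20,2) + 5];      % hypothetical sample data
k = 2;

[IDX, C, ~, D] = kmeans(X, k);           % D is n-by-k, point-to-centroid distances

% Set distances to Inf wherever point n is NOT assigned to cluster i,
% so that only cluster members can win the minimum in column i
Dmasked = D;
Dmasked(bsxfun(@ne, IDX, 1:k)) = Inf;

% closestIdx(i) is the row of X nearest to centroid i among cluster i's members
[~, closestIdx] = min(Dmasked, [], 1);

disp(closestIdx)
```

This relies on every cluster having at least one member, which holds with the default 'EmptyAction' setting of kmeans; an empty cluster would leave a column of all Inf and make the returned index meaningless.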