Proximity matrix in Python

What is the best way to compute the distance/proximity matrix for very large sparse vectors? For example, suppose you are given the following design matrix, where each row is a 68771-dimensional sparse vector.

designMatrix <5830x68771 sparse matrix of type '' with 1229041 stored elements in Compressed Sparse Row format>


Have you tried the routines in scipy.spatial.distance?

http://docs.scipy.org/doc/scipy/reference/spatial.distance.html
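As a rough sketch of what that looks like: pdist works on dense arrays, so the sparse matrix has to be densified first. The toy matrix below is an assumption standing in for designMatrix (the name design_matrix and its size are made up for illustration). At the real size, 5830 x 68771 values in float64 would take roughly 3.2 GB once dense, which is why this may not be practical.

```python
import numpy as np
from scipy import sparse
from scipy.spatial.distance import pdist, squareform

# Hypothetical small stand-in for the 5830x68771 design matrix.
design_matrix = sparse.random(50, 300, density=0.02, format="csr", random_state=0)

# pdist expects a dense 2-D array, so convert first (costly at full size).
dense_rows = design_matrix.toarray()

# Condensed form: one entry per unordered pair of rows.
condensed = pdist(dense_rows, metric="euclidean")

# Expand to a full symmetric n x n distance matrix.
distances = squareform(condensed)
```

Other metrics (for example "cosine" or "cityblock") can be selected through the metric argument.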

If this forces you to go to a dense representation, you may be better off rolling your own, depending on the density of nonzero elements. You could squeeze out the zeros while retaining a map between the new and original indices, calculate the pairwise distances on the remaining nonzero elements, and then use that map to translate the results back to the original indexing.
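One possible reading of the roll-your-own idea, as a minimal sketch rather than a definitive implementation: drop the columns that are zero in every row (keeping the original column indices as the map), then compute Euclidean distances without densifying the input by using the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b on a sparse product. The function names squeeze_zero_columns and sparse_euclidean are hypothetical, and the choice of Euclidean distance is an assumption, since the question does not name a metric.

```python
import numpy as np
from scipy import sparse

def squeeze_zero_columns(X):
    """Drop all-zero columns; return the reduced matrix and the kept original column indices."""
    X = sparse.csr_matrix(X).copy()
    X.eliminate_zeros()                       # ignore explicitly stored zeros
    kept = np.flatnonzero(X.getnnz(axis=0))   # columns with at least one nonzero
    return X[:, kept], kept

def sparse_euclidean(X):
    """Pairwise Euclidean distances between rows; only the n x n result is dense."""
    X = sparse.csr_matrix(X, dtype=np.float64)
    gram = (X @ X.T).toarray()                # dot products between all row pairs
    sq_norms = gram.diagonal()                # squared norm of each row
    sq = sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram
    np.maximum(sq, 0.0, out=sq)               # clip tiny negatives from round-off
    np.fill_diagonal(sq, 0.0)
    return np.sqrt(sq)

# Hypothetical toy data standing in for the real design matrix.
X = sparse.random(20, 200, density=0.05, format="csr", random_state=0)
X_reduced, kept = squeeze_zero_columns(X)
D = sparse_euclidean(X_reduced)
```

Here kept[j] gives the original column index of column j in X_reduced. Dropping all-zero columns does not change row-to-row distances, so D is indexed by the original rows directly. For the matrix in the question the dense result would be 5830 x 5830, about 272 MB in float64.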
