Looking for a C/C++ interface for efficient computation of huge sparse matrix in Linux
I am looking for a C/C++ interface for efficient computation of huge sparse matrix in Linux. The matrix could be of millions times millions/thousands. I have checked a few existing libraries, but it seems none of them satisfies all of my requirements,
1, I need to create a sparse matrix by dynamically adding elements to it. (not for SparseLib++)
2, I also need to be able to create a spa开发者_JS百科rse diagonal matrix so that I can scale the columns of another sparse matrix with different scalars. (didn't find a library for this, and maybe there is another way to scale the sparse matrix by columns)
3, It needs to support the operations of matrix multiplied with matrix/vector (Many libraries support these basic operations)
4, It needs to support entry-wise multiplication or division between two sparse matrices or vectors, like .* or ./ in MATLAB (didn't find a library for this, and I need this operation to screen out some entries of one sparse matrix with another sparse matrix)
5, Matrix inversion or linear solver. (Most libraries provide the solver for linear system)
I originally made use of scipy in Python to implement my algorithm. Python consumes too much memory and it is slow, and that's why I'd like to convert my program into C.
Thanks.
I would recommend CSparse by Tim Davis.
Update (2019): Tim Davis has since released SuiteSparse. This satisfies all requirements listed in the post, including the ability to incrementally building the matrix (see slide 34 in a RedisGraph presentation or "3.1.8 Non-blocking mode" in the SuiteSparse:GraphBLAS paper).
I was very happy using Intel's MKL.
Alas, most sparse matrix libraries use a format which makes very difficult to set elements dynamically (google for Compressed Sparse Row, which is the most common format).
As you said, most libraries provide you with all you requirements, except #1. For #2, it is usually easy once you understand the storage format.
Intel MKL provides the PARDISO solver which doesn't impose a format, it only requires you to be able to do matrix/vector products: you call the solver in a loop, get a return code from it, and perform what it asks you to do (multiplication by A, checking termination condition, preconditioning, ...). This way, you can use whatever storage scheme you need. It is useful when you don't want to store the matrix for instance, or if you have a funky storage format.
Your requirements (especially #1) makes an excellent candidate for quadtree based sparse matrices. However, I do not know any good implementation of this. If you manage to code it / find an implementation of it, I'd be grateful.
Have you had a look at LinBox? It supports several sparse matrix storage formats, some of which allow you to set entries after the matrix has been created:
// set m[i,j] to a
m.refEntry(i, j) = a;
I'm not sure whether your requirement 4. is satisfied out-of-the-box, but it can certainly be coded easily using the refEntry
method.
Try TAUCS or MUMPS.
I've personally tried TAUCS for a project solving sparse matrices of the order of million x million in image processing using it so that will give you an indication of the size it can deal with.
I agree with Alexandre that these packages come with their own "format" for how to encode the sparse matrix but that is inevitable because you have to tell the solver where the non-zero elements are. But on the bright side they are not difficult to learn. TAUCS by the way, uses the compressed column storage (CCS) format. What Alexandre is referring to is probably the compressed row storage (CRS). There is I think 1 more popular format which encodes explicitly the element value with the (i, j) "location" of the element in the matrix but that is about it.
For details on these packages, see www.matrixprogramming.com.
精彩评论