
Java Jama matrix

I am working with the Jama Matrix library. I used it for LSI, and it works fine. However, when I pass a big matrix, like 8000x8000, it kills my whole system. I am simply calling SVD, then reducing the matrix size and adding up. Nothing else!

Any idea how I can solve this problem?

Core 2 Duo

RAM = 10 GB

Java runtime setting: -Xmx5000M

There is no other program running while I execute the Jama matrix code.


I also use Jama for SVD and have the same problem when solving big matrices. To reduce memory-overflow cases, I have tuned SingularValueDecomposition.java into a compact version. The idea is that matrix A contains many zero values, so the compact version only allocates memory for the non-zero entries in all of the matrices used (A, U, V, Work, etc.). Before you use the compact SVD, you need to write the A matrix to a file in the form r \t c \t value \n r \t c \t value \n ..., where '\t' and '\n' mean tab and newline respectively.

For example, given the entries (0, 0, 0), (0, 1, 0.5), (0, 2, 0), (0, 3, 0.2), (1, 0, 1), (1, 1, 0), (1, 2, 0), (1, 3, 0.3) for a matrix of size 2*4 (R*C), you would write a file containing only the non-zero entries:

MATRIXSIZE \t 2 \t 4 \n
0 \t 1 \t 0.5 \n
0 \t 3 \t 0.2 \n
1 \t 0 \t 1 \n
1 \t 3 \t 0.3 \n
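For reference, here is a minimal sketch (not the author's actual compact SVD code) of how such a tab-separated sparse file could be read back into coordinate (COO) lists in Java; the file layout is assumed to be exactly the one described above, with a MATRIXSIZE header line followed by one row/column/value triple per line.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    // Reads the tab-separated sparse matrix file described above into
    // coordinate (COO) lists, storing only the non-zero entries instead
    // of a full dense double[rows][cols].
    public class SparseMatrixFile {
        public static void main(String[] args) throws IOException {
            List<Integer> rowIdx = new ArrayList<>();
            List<Integer> colIdx = new ArrayList<>();
            List<Double> values = new ArrayList<>();
            int rows = 0, cols = 0;
            try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
                String line;
                while ((line = in.readLine()) != null) {
                    String[] f = line.split("\t");
                    if (f[0].equals("MATRIXSIZE")) {   // header: MATRIXSIZE \t rows \t cols
                        rows = Integer.parseInt(f[1]);
                        cols = Integer.parseInt(f[2]);
                    } else {                           // data: row \t col \t value
                        rowIdx.add(Integer.parseInt(f[0]));
                        colIdx.add(Integer.parseInt(f[1]));
                        values.add(Double.parseDouble(f[2]));
                    }
                }
            }
            System.out.println(rows + "x" + cols + " matrix, " + values.size() + " non-zeros");
        }
    }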

If you would like to use it, please contact me by email (mg.hwang@gmail.com) and I will give you more details on how to use it.

I checked that the result is correct. However, I am not sure how much it improves efficiency on the machine. Anyway, it works and performs somewhat better, even if not by much.


You're probably facing an out-of-memory condition. You might want to increase the memory available to your JVM by using the -Xmx option; for instance, -Xmx256m will give your JVM 256 MB instead of the default of 64 MB.
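If you want to verify that the -Xmx setting actually reached the JVM, a small check like the following (a hypothetical HeapCheck class, not part of JAMA) prints the maximum heap the JVM is willing to use:

    // Run with e.g. "java -Xmx5000m HeapCheck" and confirm the printed value.
    public class HeapCheck {
        public static void main(String[] args) {
            long maxBytes = Runtime.getRuntime().maxMemory();
            System.out.println("Max heap: " + (maxBytes / (1024 * 1024)) + " MB");
        }
    }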

You may also consider using alternative libraries that provide memory-efficient matrix representations based on sparse-matrix models such as COO, DOK, CSR, etc. Look up the Wikipedia entry for "sparse matrix" for more details.
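As an illustration only, here is roughly what a CSR (compressed sparse row) representation stores; the class and field names are made up for this sketch and are not from any particular library:

    // CSR layout: only non-zero values are kept, plus their column indices
    // and one row-pointer array marking where each row starts.
    public class CsrMatrix {
        final double[] values;   // non-zero values, row by row
        final int[] colIndex;    // column of each stored value
        final int[] rowPtr;      // rowPtr[i]..rowPtr[i+1] spans row i in values[]

        CsrMatrix(double[][] dense) {
            int nnz = 0;
            for (double[] row : dense)
                for (double v : row)
                    if (v != 0.0) nnz++;
            values = new double[nnz];
            colIndex = new int[nnz];
            rowPtr = new int[dense.length + 1];
            int k = 0;
            for (int i = 0; i < dense.length; i++) {
                rowPtr[i] = k;
                for (int j = 0; j < dense[i].length; j++) {
                    if (dense[i][j] != 0.0) {
                        values[k] = dense[i][j];
                        colIndex[k] = j;
                        k++;
                    }
                }
            }
            rowPtr[dense.length] = k;
        }
    }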

This thread provides several alternatives to Jama, maybe this'll help you as well.


Peter Taylor is absolutely right.

It is a big-O issue: the cost of computing an SVD grows roughly cubically with the matrix dimension. After all, calculating the SVD of an 8000 x 8000 matrix is no small task, since you are talking about 64,000,000 elements!

If you run the JAMA MagicSquareExample, the elapsed times are roughly:

32x32 matrix: 0.062 sec
64x64: 0.0328 sec
96x96: 1.891 sec
128x128: 4.5 sec
160x160: 11.109 sec
192x192: 24.063 sec
224x224: 46.063 sec
256x256: 83.625 sec
512x512: 1716.719 sec
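As a rough back-of-envelope check (assuming approximately cubic scaling and using the 256x256 timing above as the baseline; actual figures depend heavily on hardware), extrapolating to 8000x8000 gives:

    // Back-of-envelope only: assumes roughly O(n^3) scaling from the
    // 256x256 data point above, plus the size of one dense double matrix.
    public class SvdCostEstimate {
        public static void main(String[] args) {
            double baseN = 256, baseSec = 83.625;
            double n = 8000;
            double estSec = baseSec * Math.pow(n / baseN, 3);        // ~2.5 million sec
            double denseMb = n * n * 8 / (1024.0 * 1024.0);          // ~488 MB per dense matrix
            System.out.printf("Estimated time: %.0f sec (~%.1f days)%n", estSec, estSec / 86400);
            System.out.printf("One dense %.0fx%.0f matrix: %.0f MB; the SVD keeps several such matrices%n",
                    n, n, denseMb);
        }
    }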



If you're doing LSI, then you can make two important optimizations. First, your matrix is sparse (assuming you're using a term-by-document matrix). JAMA operates on dense matrices, so you may want to look into finding a different representation. As Lolo mentioned, this will greatly reduce your overhead.

Second, LSI only requires the top k singular vectors to be computed. JAMA computes all of the singular values, which is unnecessary in your case. Moreover, if you only need the k largest, you can optimize further by using a thin SVD, which has a significantly lower memory overhead. Computing the full SVD for LSI becomes almost impossible for large document collections, so you'll eventually have to switch to something other than JAMA if you want to scale.
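For what it's worth, here is a sketch of what the rank-k reduction step looks like using JAMA's own API; note that JAMA still computes the full SVD internally, so this does not fix the memory problem, it only illustrates what "keeping the top k singular vectors" means for LSI:

    import Jama.Matrix;
    import Jama.SingularValueDecomposition;

    // Rank-k truncation of a matrix via JAMA's full SVD.
    // Assumes rows >= columns (a JAMA SVD requirement) and k <= columns.
    public class RankKReduction {
        public static Matrix rankK(Matrix a, int k) {
            SingularValueDecomposition svd = a.svd();
            int m = a.getRowDimension(), n = a.getColumnDimension();
            Matrix uk = svd.getU().getMatrix(0, m - 1, 0, k - 1);   // first k left singular vectors
            Matrix sk = svd.getS().getMatrix(0, k - 1, 0, k - 1);   // top k singular values
            Matrix vk = svd.getV().getMatrix(0, n - 1, 0, k - 1);   // first k right singular vectors
            return uk.times(sk).times(vk.transpose());              // rank-k approximation of a
        }
    }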

SVDLIBJ is one possibility for performing a thin SVD in Java. The S-Space Package also has an SVDLIBJ wrapper and command-line tool for it, as well as an LSI/LSA implementation if you want to avoid writing LSI altogether.

