
Matrix multiplication running times Python < C++ < Matlab - Explain

I have a matrix M that's 16384 x 81. I want to compute M * M.t (the result will be 16384 x 16384).

My question is: could somebody please explain the running time differences?

Using OpenCV in C++ the following code takes 18 seconds

#include <opencv2/opencv.hpp>  // cv.h is the deprecated pre-2.x header
#include <cstdio>
using namespace cv;
int main(void) {
  Mat m(16384, 81, CV_32FC1);
  randu(m, Scalar(0), Scalar(1));
  int64 tic = getTickCount();
  Mat m2 = m * m.t();
  printf("%f\n", (getTickCount() - tic) / getTickFrequency());
  return 0;
}

In Python the following code takes 18.8 seconds, not the 0.9 seconds originally reported (see the edit below)

import numpy as np
from time import time
m = np.random.rand(16384, 81)
tic = time()
result = np.dot(m, m.T)
print (time() - tic)
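For what it's worth, a slightly more careful way to time this is a sketch like the following (my own variation, not the original benchmark): time.perf_counter is monotonic and high-resolution, a warm-up call keeps one-time setup out of the measurement, and float32 matches the CV_32FC1 type used in the C++ version. The 8192-row size from the edit below keeps memory use modest.

```python
import numpy as np
from time import perf_counter

m = np.random.rand(8192, 81).astype(np.float32)  # float32 to match CV_32FC1
m @ m.T                                          # warm-up run, result discarded
tic = perf_counter()
result = m @ m.T
elapsed = perf_counter() - tic
print(f"{elapsed:.3f} s for a {result.shape[0]} x {result.shape[1]} result")
```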

In MATLAB the following code takes 17.7 seconds

m = rand(16384, 81); 
tic;
result = m * m';
toc;

My only guess would have been that it's a memory issue, and that somehow Python is able to avoid swap space. When I watch top, however, I do not see my C++ application using all the memory, and I had expected that C++ would win the day. Thanks for any insights.
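Whether memory is really the issue is easy to estimate: the inputs are tiny, but the result is not. A quick back-of-the-envelope check (plain arithmetic, independent of any of the code above):

```python
# Memory footprint of the 16384 x 16384 result matrix; the 16384 x 81
# inputs are negligible by comparison.
rows = 16384
bytes_f32 = rows * rows * 4   # CV_32FC1 / NumPy float32
bytes_f64 = rows * rows * 8   # NumPy's default float64, MATLAB's double
print(bytes_f32 / 2**30, "GiB (float32)")  # 1.0 GiB
print(bytes_f64 / 2**30, "GiB (float64)")  # 2.0 GiB
```

So unless the machine is very short on RAM, a 1-2 GiB result should fit without touching swap.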

Edit

After revising my examples to time only the operation, the Python code now takes about 18 seconds as well. I'm really not sure what's going on, but if there's enough memory, they all seem to perform the same now.

Here are the timings when the number of rows is 8192: C++ 4.5 seconds, Python 4.2 seconds, MATLAB 1.8 seconds.
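Those numbers are easier to compare as throughput: an n x 81 times 81 x n product costs roughly 2 * n^2 * 81 floating-point operations. A quick sketch plugging in the timings reported above:

```python
# Effective throughput in GFLOP/s for an (n x k) @ (k x n) product,
# counting one multiply and one add per output contribution.
def gflops(n, k, seconds):
    return 2 * n * n * k / seconds / 1e9

print(round(gflops(16384, 81, 18.0), 1))  # all three implementations at ~18 s
print(round(gflops(8192, 81, 4.5), 1))    # C++ and Python at 8192 rows
print(round(gflops(8192, 81, 1.8), 1))    # MATLAB at 8192 rows
```

Interestingly, the 18-second and 4.5-second timings work out to the same ~2.4 GFLOP/s, while MATLAB's 1.8 seconds corresponds to roughly 6 GFLOP/s, which points toward a faster BLAS rather than a memory effect.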


What CPU are you running on? For modern x86 and x64 chips with dynamic clocking, getTickCount and getTickFrequency cannot be trusted.

18 seconds is long enough to get acceptable precision from the standard OS functions based on the timer interrupt.

And what BLAS are you using with OpenCV? MATLAB ships highly optimized ones; IIRC it even detects your CPU and loads Intel's or AMD's math library accordingly.
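On the NumPy side, one quick way to check which BLAS the installation is linked against (assuming a reasonably recent NumPy; the exact output format varies by version):

```python
# Prints the BLAS/LAPACK libraries NumPy was built against.
# Seeing MKL or OpenBLAS here, versus a reference BLAS, usually
# explains a large gap in matrix-multiplication speed.
import numpy as np
np.show_config()
```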
