Is there an overhead when a C++ mex file finishes and passes data back to MATLAB?
I have written a mex file in C++ to speed up slow 'for loops' in MATLAB. I have written two versions, one without OpenMP and one with. The gains have been very good. However, when testing the timings I noticed an unexpected result, which the multithreading made visible: a time lag when the mex file passes back to MATLAB.
I run the programs through a master file in MATLAB, which calls both versions of the mex file and times them using tic-toc; it also calculates the loops itself. After each mex file has finished, the time is displayed in the MATLAB command window as the next calculation proceeds. Also, when the multithreaded mex file starts it is obvious from watching the CPU usage, as both CPUs go to 100%. The code is in the format:
Initial data generation for inputs....;
tic;
[Output] = mex_unthreaded(inputs...);
Time_unthreaded = toc
tic;
[Output_threaded] = mex_threaded(inputs...);
Time_threaded = toc
tic;
MATLAB loops...;
Time_MATLAB = toc
From the non-OpenMP mex file it is not obvious when the C++ passes back to MATLAB, as there is no definite change in the CPU usage. However, when the OpenMP mex file runs there is an obvious end point to the C++ code when it passes back to MATLAB: the CPU usage drops. This means I can, roughly, time the C++ code by watching the CPU usage. There seems to be a lag of about 20% between when the CPU usage drops and when the timings appear in MATLAB. For example, from watching the CPU usage, I get ~300s with the CPUs at 100% and then an additional ~75s until MATLAB registers ~377s from the tic-toc timing.
I can only think that this is some sort of overhead in the data being passed back to MATLAB, as the lag timings increase as the data outputted gets larger. The reason I am not comfortable with this result is that I thought the data was handled in C++ by pointers to the MATLAB mxArrays, in MATLAB memory. Therefore, there should be no passing back of any information.
The second possibility I have thought of is that MATLAB may be carrying out some form of analysis on the data once the mex file has ended, e.g. max-mins etc.
If anyone could shed some light on this matter it would be much appreciated.
Many Thanks
I'm able to call a mex function (which also uses OpenMP) on my system, passing a fairly large amount of data in and out (about 100 MB in, about 6 MB out) in under a millisecond according to tic/toc, which rather implies that any post-processing MATLAB is doing in my case is very minimal.
What class of data are you returning (real or complex matrices, cell arrays, structure arrays, or something else) and how much data?
Remember that OpenMP might not be able to parallelize all your code: there's often a reduction step necessary at the end, or the split of computation might be uneven so one CPU takes longer, or you might have a chunk of processing without OpenMP pragmas. But I don't think returning to MATLAB is causing the delay you see.
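To illustrate the uneven-split case, here is a minimal sketch in plain C++ (no OpenMP needed to compile it). It assumes a hypothetical loop in which iteration i costs i + 1 units of work, as in a triangular nested loop; nothing in the question says the real loop looks like this. The function names are made up for the example. It compares a contiguous block split, which is what a static schedule typically produces, with a round-robin split comparable to `schedule(static, 1)`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical cost model (an assumption for illustration only):
// iteration i costs (i + 1) units of work.
static long long iteration_cost(std::size_t i) {
    return static_cast<long long>(i) + 1;
}

// Contiguous block split: thread t gets iterations [n*t/threads, n*(t+1)/threads).
std::vector<long long> block_split_work(std::size_t n, std::size_t threads) {
    std::vector<long long> work(threads, 0);
    for (std::size_t t = 0; t < threads; ++t) {
        const std::size_t begin = n * t / threads;
        const std::size_t end = n * (t + 1) / threads;
        for (std::size_t i = begin; i < end; ++i) {
            work[t] += iteration_cost(i);
        }
    }
    return work;
}

// Round-robin split: iteration i goes to thread (i % threads).
std::vector<long long> round_robin_work(std::size_t n, std::size_t threads) {
    std::vector<long long> work(threads, 0);
    for (std::size_t i = 0; i < n; ++i) {
        work[i % threads] += iteration_cost(i);
    }
    return work;
}
```

Under this assumed cost model, with 1000 iterations and two threads the block split gives the first thread 125250 units and the second 375250, so the first thread would sit idle for a long stretch at the end. The round-robin split gives 250000 and 250500. If the real loop has a similar shape, trying a different `schedule` clause may be worth a test.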
I think you are right about the lag being proportional to the size of the data. I realized that after my post! I have also run the code with timings in C++ using GetTickCount and get almost exactly the same time as with tic/toc.
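For anyone wanting to repeat the in-C++ timing on a platform without GetTickCount, a portable sketch using the standard `std::chrono::steady_clock` might look like this. The helper name `elapsed_seconds` is hypothetical, and this is not the code I actually used; it only shows the general idea of wrapping the computational part of the mex function in a timer.

```cpp
#include <chrono>
#include <utility>

// Runs `body` once and returns the wall-clock time it took, in seconds.
// steady_clock is monotonic, so it is not affected by system clock changes.
template <typename F>
double elapsed_seconds(F&& body) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(body)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Wrapping the loops in a lambda, for example `double t = elapsed_seconds([&] { /* loops */ });`, and comparing `t` with the tic/toc figure shows whether any time is spent outside the C++ computation.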
I am using an Intel i5 to test the multithreading. It does seem that the split is not even; however, it seems strange that the imbalance always shows up at the end and not throughout the calculations. The two cores go to 100% at the start of the code and stay there until towards the end. I am assuming that one thread must be finishing before the other, although from the loops the work should be spread very evenly between the two threads and not differ by 20%.
Thank you for your reply. I am satisfied now that there is no overhead in transferring data between MATLAB and the mex file. I just have to sort out my C++ multithreaded coding!