OpenMP and SSE: my program doesn't speed up

Here is a part of my code which runs parallel:

timer.Start();
for(int i = 0; i < params.epochs; ++i)
{
    #pragma omp for
    for(int j = 0; j < min_net; ++j)
    {
        std::pair<CVectorSSE,CVectorSSE>& sample = data_set[j];
        nets[j]->Approximate(sample.first,net_outputs[j]);
        out_gradients[j].SetDifference(net_outputs[j],sample.second);
        nets[j]->BackPropagateGradient(out_gradients[j],net_gradients[j]);
    }
}
timer.Stop();

epochs = 100

I have an AMD Athlon X2 5000+.

When I run this code without the omp directive, the time is the same. And when I watch Task Manager / Performance while running both programs (with and without omp), both cores are used in both cases. So it seems that VS (VS 2008) somehow optimizes the code the way omp would?

The code inside the parallel loop uses SSE instructions. I wondered whether multicore processors might have only one SSE unit, but that would be stupid. So maybe someone can tell me what I am doing wrong? I know it depends on the code inside the loop, but if that code is parallel then it MUST speed up...

Okay, I am definitely doing something wrong. Look at this code:

time_t start;
time_t stop;

start = time(NULL);
#pragma omp for
for(int i = 0; i < 10; ++i)
{
    Sleep(1000);
}
stop = time(NULL);

cout<<difftime(stop,start)<<endl;

Without omp it should sleep for 10 seconds (10 * 1000 ms); with omp it should sleep for less than 10 seconds, because 2 threads can sleep at the same time, right? BUT it again sleeps for 10 seconds. How is that possible?


I tried the second example on Linux with gcc. My program runs for 3 seconds on a Core i3. I guess the problem is that you have not configured OpenMP correctly. GCC needs the -fopenmp option to enable OpenMP. A similar setting may be necessary for VS; Visual C++ has the /openmp compiler option for this.
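
Two things are worth checking here, and the sketch below covers both. First, compilers define the `_OPENMP` macro only when OpenMP is actually switched on, so it gives a quick way to confirm the build setting. Second, a bare `#pragma omp for` only splits the loop among threads that already exist; when it is not inside a `#pragma omp parallel` region there is just one thread, so the loop runs serially. The combined `#pragma omp parallel for` creates the threads and splits the loop in one step. The names `openmp_enabled`, `LoopResult` and `timed_sleep_loop` are made up for this illustration, and the code assumes a C++11 compiler (on VS 2008 you would keep `Sleep` and `time` as in the question).

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: reports whether this translation unit was compiled
// with OpenMP enabled (-fopenmp for GCC, /openmp for Visual C++).
bool openmp_enabled() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Hypothetical result type: how many iterations ran and how long it took.
struct LoopResult {
    int iterations_run;
    double elapsed_seconds;
};

// Sleeps `sleep_ms` milliseconds in each of `iterations` loop iterations
// and measures the wall-clock time of the whole loop.
LoopResult timed_sleep_loop(int iterations, int sleep_ms) {
    assert(iterations >= 0 && sleep_ms >= 0);
    int done = 0;
    const auto start = std::chrono::steady_clock::now();

    // "parallel for" starts a team of threads AND distributes the
    // iterations among them. If OpenMP is disabled, the pragma is
    // ignored and the loop simply runs serially.
    #pragma omp parallel for reduction(+:done)
    for (int i = 0; i < iterations; ++i) {
        std::this_thread::sleep_for(std::chrono::milliseconds(sleep_ms));
        done += 1;  // each thread counts privately; OpenMP sums the counts
    }

    const auto stop = std::chrono::steady_clock::now();
    return {done, std::chrono::duration<double>(stop - start).count()};
}
```

Calling `timed_sleep_loop(10, 1000)` reproduces the test from the question. If `openmp_enabled()` returns false, expect roughly 10 seconds; with OpenMP enabled and two threads, the ten iterations should be split five and five, so roughly 5 seconds.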
