Why is my OpenMP implementation slower than a single threaded implementation? (Followup)
This is a follow-up to "Why is my OpenMP implementation slower than a single threaded implementation?".
I followed the answer given there and used tasking instead of for pragmas to speed up the code. However, the reworked program runs just as fast as the equivalent sequential program. I see no speedup.
The reworked code is here: http://pastebin.com/3SFaNEc4
I simply removed all the for pragmas and replaced them with task pragmas for the recursive procedures.
Am I doing anything wrong? I expected to see an almost linear speedup. What do you guys think?
Thanks!
First - you still have a "#pragma end critical", which should be removed. It isn't causing a problem, but it is incorrect. Second - as I said in the other question you posted, you may have to rethink how you are parallelizing the code in order to see a speedup, so just replacing the other pragmas with task pragmas may not make it faster. Third - you haven't put the tasks inside a parallel region, so you are not running in parallel at all. And you can't just wrap a parallel region around the tasks, because every thread in the team would then run the task-generating code and you would be doing the same tasks multiple times.