Number of active warps in GPU (Fermi)

I have a quick question about active warps on a GPU (I am mainly interested in Fermi). For a specific kernel, is the number of active warps on an SM in any given cycle the same for the whole execution time of the kernel? In my experiments, there is some correlation between the total number of active warps (for the whole execution) and the number of synchronizations in the kernel. Can anyone clarify this relationship? Thanks.


The number of active warps can vary over time since:

  • Other threadblocks can complete or begin on the same SM. For example, with four warps per threadblock, an SM with only one resident threadblock has up to four active warps, while with two or three resident threadblocks it has up to eight or twelve respectively.
  • Once a warp reaches the end of its code, it no longer executes anything and so is no longer active.
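
The arithmetic in the first point can be sketched as follows. This is only an illustration of the upper bound: the function names are made up for this example, and it ignores the hardware limits (registers, shared memory, maximum resident blocks and warps per SM) that decide how many threadblocks actually fit.

```python
import math

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs, including Fermi


def warps_per_block(threads_per_block):
    """Warps needed for one threadblock (a partial warp still occupies a full warp)."""
    return math.ceil(threads_per_block / WARP_SIZE)


def resident_warps(resident_blocks, threads_per_block):
    """Upper bound on active warps for a given number of resident threadblocks."""
    return resident_blocks * warps_per_block(threads_per_block)


# 128 threads per block gives four warps per block, as in the example above.
for blocks in (1, 2, 3):
    print(blocks, "block(s):", resident_warps(blocks, 128), "warps")
```

As threadblocks retire and new ones are scheduled, the number of resident blocks changes, and this bound changes with it.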

The active warps count for a whole program execution depends on a number of factors, but remember that on each cycle it is incremented by the number of warps active in that cycle. So if you increase the number of syncs, which also increases the number of cycles each warp needs to execute the kernel, you would expect a higher active warps count.
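
A toy model of such a counter, to make the effect of extra syncs concrete. Everything here is an assumption for illustration: the cycle counts are invented, each sync is modelled as a fixed number of extra cycles per warp, and the function names are hypothetical rather than anything the profiler exposes.

```python
def accumulated_active_warps(warp_lifetimes):
    """Sum, over every cycle, the number of warps still active in that cycle.

    warp_lifetimes holds, for each warp, how many cycles it stays active.
    """
    total = 0
    for cycle in range(max(warp_lifetimes, default=0)):
        # A warp counts as active in this cycle if it has not yet finished.
        total += sum(1 for lifetime in warp_lifetimes if cycle < lifetime)
    return total


def lifetimes(num_warps, base_cycles, num_syncs, cycles_per_sync):
    """Hypothetical model: every sync adds a fixed delay to each warp."""
    return [base_cycles + num_syncs * cycles_per_sync] * num_warps


no_syncs = accumulated_active_warps(lifetimes(4, 100, 0, 20))
five_syncs = accumulated_active_warps(lifetimes(4, 100, 5, 20))
print(no_syncs, five_syncs)  # prints: 400 800
```

The number of warps is the same in both runs; only the time they stay active differs, and the accumulated count grows with it.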

Also note that some derived statistics in the profiler are approximate since they often use values from more than one run, hence there can be some variability.


The relationship between barrier synchronization and warps is explained in the paper Demystifying GPU Microarchitecture through Microbenchmarking.
