How to measure CPU cycles per instruction in a C program
I have a C program where I am starting to use some SIMD optimizations for the SPE (Cell processor), etc. I would like somehow to "time" how many cycles do they need. One idea is to switch on/off and measure whole execution time. But this is开发者_开发问答 slow. I can also add between and before the execution gettimeofday(&start,NULL) and so statements, but they are only precise I think when one copes with more than miliseconds.
I wonder if it is possible to measure efficiently the nanoseconds per instruction or just the CPU cycles or some other precise timing measure.
Depending on your CPU you may be able to get at performance registers within the CPU itself which track instruction clocks and many other useful things. Profilers and other performance utilities can do this, so it should also be possible from user code too. On Mac OS X I would use the Apple CHUD framework, but you didn't state what OS or CPU you are using so it's hard to give specific suggestions.
Execute the code to be tested in a loop and divide the time it takes with the loop counter. The timer you use must not be high resolution to measure correct values.
Nano seconds won't be enough for that. You need picoseconds.
I don't think that you can measure something like this reliably. You will have to look into specifications (I'm not sure if current CPUs have this information documented).
As a not C guy... my guess is you need to look at the assembly code, and go from there. The only problem is a single instruction could take 1 or 100000 cpu cycles, depending on the exact CPU you are on.
精彩评论