Do Core i3/5/7 CPUs provide a mechanism to measure IPC?
All the Intel CPUs in the last decade (at least) include a set of performance monitors that count a variety of events. Do the latest Intel CPUs, Core i3, i5 and i7 (aka Nehalem) provide a mechanism to count Instructions Per Clock (IPC)? If so, how are they used?
If this is possible, I'll probably be writing the code for this in assembly, but Windows or Linux system calls may also come in useful.
Yes, Intel's VTune (Linux and Windows) can measure IPC.
If you want to measure it yourself with precise counters for a specific part of your code, you need to use a performance API such as PAPI or perfctr (both for Linux).
They use the hardware performance counters described in the Intel manuals: http://www.intel.com/products/processor/manuals/
Volume 3B, Chapter 30 and Appendix A: http://www.intel.com/Assets/PDF/manual/253669.pdf
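For a quick whole-process measurement on Linux, one alternative to coding against PAPI or perfctr is the `perf stat` command-line tool. That tool is not mentioned above; it is swapped in here on the assumption that perf is installed and that the kernel allows unprivileged counter access. A minimal sketch follows. The helper names `parse_perf_csv` and `measure_ipc` are made up for this illustration, and the CSV field order (count, unit, event name) is assumed from the perf-stat man page:

```python
import subprocess

def parse_perf_csv(text):
    """Sum counter values per event from `perf stat -x,` output."""
    totals = {"cycles": 0, "instructions": 0}
    for line in text.splitlines():
        fields = line.split(",")
        if len(fields) < 3:
            continue  # not a counter line
        try:
            count = int(fields[0])
        except ValueError:
            continue  # e.g. "<not supported>" or "<not counted>"
        for name in totals:
            # Event names may carry modifiers or a PMU prefix, so match loosely.
            if name in fields[2]:
                totals[name] += count
    return totals

def measure_ipc(command):
    """Run `command` (a list of strings) under perf stat and return its IPC."""
    result = subprocess.run(
        ["perf", "stat", "-x,", "-e", "cycles,instructions", "--"] + command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,  # perf stat writes its report to stderr
        text=True,
        check=True,
    )
    totals = parse_perf_csv(result.stderr)
    if totals["cycles"] == 0:
        raise RuntimeError("no cycle count reported; counters may be unavailable")
    return totals["instructions"] / totals["cycles"]
```

A call such as `measure_ipc(["sleep", "0.1"])` would then return the IPC of that whole process. To measure only part of a program you would still need an in-process API like the ones named above.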
VTune computes CPI ("cycles per instruction retired") as the ratio of "Non-sleep clockticks" to "Instructions Retired"; IPC is simply the reciprocal. For Core2 the performance counters used are "CPU_CLK_UNHALTED.CORE" and "INST_RETIRED.ANY".
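As a small illustration of that ratio, here is a sketch that turns two counter deltas into CPI and IPC. The function names and the sample numbers are hypothetical; the inputs are assumed to be the differences between counter readings taken before and after the code of interest:

```python
def cpi(clockticks, instructions_retired):
    """Cycles per instruction retired (unhalted clockticks / instructions)."""
    if instructions_retired <= 0:
        raise ValueError("no instructions retired in the measured interval")
    return clockticks / instructions_retired

def ipc(clockticks, instructions_retired):
    """Instructions per clock, the reciprocal of CPI."""
    if clockticks <= 0:
        raise ValueError("no unhalted clockticks in the measured interval")
    return instructions_retired / clockticks

if __name__ == "__main__":
    # Made-up deltas for CPU_CLK_UNHALTED.CORE and INST_RETIRED.ANY.
    cycles, instructions = 3_000_000, 4_500_000
    print("IPC:", ipc(cycles, instructions))
    print("CPI:", cpi(cycles, instructions))
```

With these made-up deltas the IPC comes out at 1.5, meaning the core retired on average one and a half instructions per unhalted cycle.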
The unhalted-cycle and instructions-retired counters are architectural events, so they are the same for all Core* CPUs. From Appendix A.1 of Volume 3B, page 384:
Table A-1. Architectural Performance Events
Event num. | Event Mask Mnemonic  | Umask | Description
3CH        | UnHalted Core Cycles | 00H   | Unhalted core cycles
C0H        | Instruction Retired  | 00H   | Instruction retired
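If you program the counters yourself, the event number and umask from this table go into an IA32_PERFEVTSELx register, which has to be written with WRMSR from kernel mode. The sketch below only builds the register value and does not touch any hardware. The bit positions (event select in bits 0-7, umask in bits 8-15, USR at bit 16, OS at bit 17, EN at bit 22) follow the layout in the Intel manual; the function name `perfevtsel` and its parameters are invented for this example:

```python
# Flag bits in IA32_PERFEVTSELx.
USR = 1 << 16  # count while running in user mode
OS = 1 << 17   # count while running in kernel mode
EN = 1 << 22   # enable the counter

# Event number and umask pairs taken from Table A-1.
ARCH_EVENTS = {
    "UnHalted Core Cycles": (0x3C, 0x00),
    "Instruction Retired": (0xC0, 0x00),
}

def perfevtsel(event, umask, user=True, kernel=True):
    """Build the value to write to an IA32_PERFEVTSELx register."""
    if not 0 <= event <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("event and umask must each fit in 8 bits")
    value = event | (umask << 8) | EN
    if user:
        value |= USR
    if kernel:
        value |= OS
    return value

if __name__ == "__main__":
    for name, (event, umask) in ARCH_EVENTS.items():
        print(f"{name}: {perfevtsel(event, umask):#010x}")
```

Passing `kernel=False` restricts counting to user-mode code, which is usually what you want when measuring your own routine.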
IPC is becoming less meaningful as a single figure now that the current crop of CPUs can complete multiple instructions per clock.
From an i7 propaganda document:
The chip boasted a wider execution core, allowing the processor to complete up to four full instructions simultaneously, along with a more efficient 14-stage pipeline improving IPC (instructions per clock) in comparison to Pentium 4/D
Those IPC counts all depend on the type of code that is being executed.