
Intel Assembler optimization

I'm currently trying to optimize the code emitted from a home-made compiler, for a home-made language.

I've tried out Intel VTune to see where the bottlenecks are: http://www.imada.sdu.dk/~sorenh07/misc/vtune-assembly-optimization.png

I find it very surprising that a "subl" instruction is responsible for over 38% of the clockticks in a program that runs for 30-90 seconds! Can anybody explain why?

The "optimization report" feature in VTune apparently doesn't exist for programs not compiled with icc. Is there a program that suggests optimizations for assembler code (that is, code that does not come from a high-level language)?


My guess is that it's the idivl instruction that's actually taking up the 38%. Division taking longer than subtraction makes a bit more sense, no?
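One way to check that guess is a small microbenchmark. Sampling profilers often charge a stall to an instruction just after the one that caused it, so a subl that directly follows an idivl could plausibly end up with the division's cost. The sketch below is only an illustration under assumptions: the names `chain_sub`, `chain_div` and `compare_costs` are made up for this example, the operands are arbitrary, and the actual ratio depends on the CPU and compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Dependent chain of subtractions: every step needs the previous result. */
int32_t chain_sub(int32_t x, int32_t step, uint32_t iters) {
    volatile int32_t s = step; /* re-read each iteration so the loop is not folded away */
    for (uint32_t i = 0; i < iters; i++) {
        x -= s;
    }
    return x;
}

/* Dependent chain of signed divisions (idiv on x86). */
int32_t chain_div(int32_t x, int32_t divisor, int32_t bias, uint32_t iters) {
    assert(divisor != 0);
    volatile int32_t d = divisor; /* keeps the compiler from turning the division into a multiply */
    for (uint32_t i = 0; i < iters; i++) {
        x = x / d + bias; /* the bias keeps x from collapsing to 0 */
    }
    return x;
}

/* Time both chains with clock() and print the CPU seconds used by each. */
void compare_costs(uint32_t iters) {
    clock_t t0 = clock();
    /* INT32_MAX minus any uint32_t count stays within int32_t, so no overflow. */
    int32_t a = chain_sub(INT32_MAX, 1, iters);
    clock_t t1 = clock();
    int32_t b = chain_div(1000000, 3, 1000000, iters);
    clock_t t2 = clock();

    printf("sub chain: %.3f s (result %ld)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC, (long)a);
    printf("div chain: %.3f s (result %ld)\n",
           (double)(t2 - t1) / CLOCKS_PER_SEC, (long)b);
}
```

Calling `compare_costs(100000000u)` from `main` and comparing the two times gives a rough idea of how much more a dependent division costs than a dependent subtraction on the machine at hand. If the division chain is much slower, the 38% shown on subl is more likely misattributed cost from the neighbouring idivl.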
