GCC optimisation: use of ARM conditional instructions?
I'm looking at some code compiled for iOS in Xcode (so compiled for ARM with gcc) and, as far as I can see, the compiler never uses ARM's ability to attach a condition to arbitrary instructions. Instead it always branches on a condition, as would be the case on Intel and other architectures.
Is this simply a restriction of GCC (I can understand that it might be: that "condition = branch" is embedded at too high a level in the compiler architecture to allow otherwise), or is there a particular optimisation flag that needs to be turned on to allow compilation of conditional instructions?
(Obviously I appreciate I'm making big assumptions about where conditional instructions "ought" to be used and would actually be an optimisation, but I have experience of programming earlier ARM chips and of using and analysing the output of Acorn's original ARM C compiler, so I have a rough idea.)
Update: Having investigated this more thanks to the information below, it turns out that:
- Xcode compiles in Thumb-2 mode, in which conditional execution of arbitrary instructions is not available;
- Under some circumstances, it does however use the IT (if-then) instruction, for example in its ITE (if-then-else) form, to effectively produce instructions with conditional execution.
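To see which of the two behaviours you get, it may help to compile a small function whose branches are short enough to be candidates for conditional execution and then read the generated assembly (for example with `-O2 -S`). The function below is only an illustrative sketch; the name `clamp_to_byte` is made up for this example, and whether the compiler emits conditional instructions, an IT block, or plain branches for it depends on the compiler version, the optimisation level and the instruction set selected.

```c
/* Hypothetical test function: clamps a value to the range 0..255.
 * Each branch body is a single assignment, which is the kind of code
 * a compiler may turn into conditionally executed moves (ARM mode)
 * or an IT block (Thumb-2) instead of conditional branches. */
int clamp_to_byte(int x)
{
    if (x < 0)
        x = 0;
    else if (x > 255)
        x = 255;
    return x;
}
```

Comparing the output of the same file built once in Thumb mode and once in ARM mode should show the difference, if there is one for your compiler.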
Seeing some actual assembly would make things clear, but I suspect that the default settings for iOS compilation prefer generation of Thumb code instead of ARM for better code density. While there are pseudo-conditional instructions in Thumb32 aka Thumb-2 (supported in ARMv7 architecture via the IT instruction), the original Thumb16 only has conditional branches. Also, even in ARM mode there are some instructions that cannot be conditional (e.g. many NEON instructions use the extended opcode space with condition field set to NV).
Yes, gcc does not really produce optimal code with respect to conditional instructions. It works well in the simplest cases, but real code suffers from some pointless slowdowns that can be avoided in hand-coded ARM asm. Just to give you a rough idea, I was able to get a 2x speedup for a very low-level graphics blit method by doing the read/write and copy logic in ARM asm instead of relying on the code gcc emitted for the C version. But keep in mind that this optimization is only worth it for the most heavily used parts of your code. It takes a lot of work to write well-optimized ARM asm, so don't even attempt it unless there is a real benefit in the optimization.
The first thing to keep in mind is that Xcode uses Thumb mode by default, so in order to generate ARM asm you will need to add the -mno-thumb option to the per-file compiler options for the specific .c file that will contain the ARM asm. Once ARM asm is being emitted, you will want to conditionally compile asm statements as indicated in the answer to the following question:
ARM asm conditional compilation question
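As a rough sketch of what that conditional compilation can look like (this is not the code from the linked answer): GCC and Clang predefine `__arm__` when targeting 32-bit ARM and `__thumb__` when generating Thumb code, so the hand-written ARM path can be guarded by those macros with a plain C fallback everywhere else. The function name `max_int` and the choice of operation are assumptions made purely for illustration.

```c
/* Hypothetical example: returns the larger of a and b.
 * The asm path is used only when compiling for ARM in ARM (non-Thumb)
 * mode; every other target falls back to portable C. */
static inline int max_int(int a, int b)
{
#if defined(__arm__) && !defined(__thumb__)
    int result;
    __asm__ ("cmp   %1, %2\n\t"   /* compare a with b                  */
             "movge %0, %1\n\t"   /* if a >= b: result = a             */
             "movlt %0, %2"       /* if a <  b: result = b             */
             : "=&r" (result)     /* early clobber: written before %2 is read */
             : "r" (a), "r" (b)
             : "cc");             /* cmp modifies the condition flags  */
    return result;
#else
    return a > b ? a : b;         /* portable fallback (Thumb, simulator, etc.) */
#endif
}
```

With a guard like this the same file still builds for the simulator or for Thumb, and only the ARM-mode build picks up the conditional instructions.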