x86 assembler: floating point compare
As part of a compiler project I have to write GNU assembler code for x86 to compare floating point values. I have tried to find resources on how to do this online and from what I understand it works like this:
Assuming the two values I want to compare are the only values on the floating point stack, then the fcomi instruction will compare the values and set the CPU flags so that the je, jne, jl, ... instructions can be used.
I'm asking because this only works sometimes. For example:
.section .data
msg: .ascii "Hallo\n\0"
f1: .float 10.0
f2: .float 9.0
.globl main
.type main, @function
main:
flds f1
flds f2
fcomi
jg leb
pushl $msg
call printf
addl $4, %esp
leb:
pushl $0
call exit
will not print "Hallo" even though I think it should, and if you switch f1 and f2 it still won't, which is a logical contradiction. je and jne, however, seem to work fine.
What am I doing wrong?
PS: does fcomip pop only one value or does it pop both?
TL;DR: Use the above / below conditions (as for unsigned integers) to test the result of FP compares.
For various historical reasons (fcomi, new in PPro, matches the mapping from the FP status word to FLAGS produced by the older fcom / fstsw / sahf sequence), FP compares set CF, not OF / SF. See also http://www.ray.masmcode.com/tutorial/fpuchap7.htm
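For comparison, the older sequence could look roughly like the sketch below. It assumes the f1 and f2 labels from the question; st0_above is a hypothetical label. Note that this path clobbers %ax, which fcomi avoids.

```asm
    flds    f1              # st(0) = 10.0
    flds    f2              # st(0) = 9.0, st(1) = 10.0
    fcompp                  # compare st(0) with st(1), then pop both
    fnstsw  %ax             # copy the FPU status word (holds C3, C2, C0) into AX
    sahf                    # load AH into FLAGS: C3 -> ZF, C2 -> PF, C0 -> CF
    ja      st0_above       # taken only if st(0) > st(1), same test as after fcomi
```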
Modern SSE/SSE2 scalar compares into FLAGS follow this as well, with [u]comiss / sd. (Unlike SIMD compares, which have a predicate as part of the instruction, as an immediate, since they only produce a single all-zeros / all-ones result for each element, not a set of FLAGS.)
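A minimal sketch of the SSE version of the same compare, assuming the f1 and f2 labels from the question and a CPU with SSE; f1_above is a hypothetical label:

```asm
    movss   f1, %xmm0       # xmm0 = 10.0
    ucomiss f2, %xmm0       # compare xmm0 with f2 (9.0); result goes to ZF, PF, CF
    ja      f1_above        # taken when xmm0 > f2 (CF=0 and ZF=0), so taken here
```

As with fcomi, the unsigned-style conditions (ja, jae, jb, jbe) are the ones to use after ucomiss.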
This is all coming from Volume 2 of Intel 64 and IA-32 Architectures Software Developer's Manuals.
FCOMI sets only some of the flags that CMP does. Your code has %st(0) == 9 and %st(1) == 10 (the values are pushed onto a stack, so the last one loaded is on top). Referring to the table on page 3-348 in Volume 2A, you can see that this is the case "ST0 < ST(i)", so it will clear ZF and PF and set CF. Meanwhile, on pg. 3-544 Vol. 2A you can read that JG means "Jump short if greater (ZF=0 and SF=OF)". In other words, it's testing the sign, overflow and zero flags, but FCOMI doesn't set sign or overflow!
Depending on which conditions you wish to jump, you should look at the possible comparison results and decide when you want to jump.
+--------------------+---+---+---+
| Comparison results | Z | P | C |
+--------------------+---+---+---+
| ST0 > ST(i)        | 0 | 0 | 0 |
| ST0 < ST(i)        | 0 | 0 | 1 |
| ST0 = ST(i)        | 1 | 0 | 0 |
| unordered          | 1 | 1 | 1 |  one or both operands were NaN.
+--------------------+---+---+---+
I've made this small table to make it easier to figure out:
+--------------+---+---+-----+------------------------------------+
| Test         | Z | C | Jcc | Notes                              |
+--------------+---+---+-----+------------------------------------+
| ST0 < ST(i)  | X | 1 | JB  | ZF will never be set when CF = 1   |
| ST0 <= ST(i) | 1 | 1 | JBE | Either ZF or CF is ok              |
| ST0 == ST(i) | 1 | X | JE  | CF will never be set in this case  |
| ST0 != ST(i) | 0 | X | JNE |                                    |
| ST0 >= ST(i) | X | 0 | JAE | As long as CF is clear we are good |
| ST0 > ST(i)  | 0 | 0 | JA  | Both CF and ZF must be clear       |
+--------------+---+---+-----+------------------------------------+
Legend: X: don't care, 0: clear, 1: set
In other words, the condition codes match those for unsigned comparisons. The same goes if you're using FCMOVcc.
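As an illustration of the FCMOVcc case, here is a sketch that leaves the larger of two values in st(0). It assumes the f1 and f2 labels from the question and a P6-or-later CPU:

```asm
    flds    f1              # st(0) = 10.0
    flds    f2              # st(0) = 9.0, st(1) = 10.0
    fcomi   %st(1), %st     # compare st(0) with st(1); here st(0) < st(1), so CF=1
    fcmovb  %st(1), %st     # if CF=1 ("below"): st(0) = st(1)
    fstp    %st(1)          # store st(0) into st(1) and pop: one value left, the result
```

Because the unordered result also sets CF, the move is performed as well when either operand is NaN.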
If either (or both) operand to fcomi is NaN, it sets ZF=1 PF=1 CF=1. (FP compares have 4 possible results: >, <, ==, or unordered.) If you care what your code does with NaNs, you may need an extra jp or jnp. But not always: for example, ja is only true if CF=0 and ZF=0, so it will be not-taken in the unordered case. If you want the unordered case to take the same execution path as below or equal, then ja is all you need.
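If NaNs do need their own path, one possible arrangement is to test PF first. This is only a sketch: it assumes both operands are already on the FPU stack, and the labels is_unordered, is_below and is_equal are hypothetical.

```asm
    fcomip  %st(1), %st     # compare st(0) with st(1), then pop st(0)
    fstp    %st(0)          # pop the remaining value; fstp does not touch FLAGS
    jp      is_unordered    # PF=1 only when at least one operand was NaN
    jb      is_below        # st(0) < st(1)
    je      is_equal        # st(0) == st(1)
    # falling through here means st(0) > st(1)
```

Checking jp before jb and je matters, because the unordered result also sets CF and ZF.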
Here you should use JA if you want it to print (i.e. if (!(f2 > f1)) { puts("hello"); }) and JBE if you don't (corresponds to if (!(f2 <= f1)) { puts("hello"); }). (Note this might be a little confusing, because we only print if we don't jump.)
Regarding your second question: by default fcomi doesn't pop anything. You want its close cousin fcomip, which pops only %st(0), so the other value stays on the stack. You should always clear the FPU register stack after use, so all in all your program ends up like this, assuming you want the message printed:
.section .rodata
msg: .ascii "Hallo\n\0"
f1: .float 10.0
f2: .float 9.0
.text # code goes in the executable .text section, not .rodata
.globl main
.type main, @function
main:
flds f1
flds f2
fcomip
fstp %st(0) # to clear stack
ja leb # won't jump, jbe will
pushl $msg
call printf
addl $4, %esp
leb:
pushl $0
call exit