开发者

In GCC-style extended inline asm, is it possible to output a "virtualized" boolean value, e.g. the carry flag?

If I have the following C++ code to compare two 128-bit unsigned integers, with inline amd-64 asm:

struct uint128_t {
    uint64_t lo, hi;
};
inline bool operator< (const uint128_t &a, const uint128_t &b)
{
    uint64_t temp;
    bool result;
    __asm__(
        "cmpq %3, %2;"
        "sbbq %4, %1;"
开发者_运维百科        "setc %0;"
        : // outputs:
        /*0*/"=r,1,2"(result),
        /*1*/"=r,r,r"(temp)
        : // inputs:
        /*2*/"r,r,r"(a.lo),
        /*3*/"emr,emr,emr"(b.lo),
        /*4*/"emr,emr,emr"(b.hi),
        "1"(a.hi));
    return result;
}

Then it will be inlined quite efficiently, but with one flaw. The return value is done through the "interface" of a general register with a value of 0 or 1. This adds two or three unnecessary extra instructions and detracts from a compare operation that would otherwise be fully optimized. The generated code will look something like this:

    mov    r10, [r14]
    mov    r11, [r14+8]
    cmp    r10, [r15]
    sbb    r11, [r15+8]
    setc   al
    movzx  eax, al
    test   eax, eax
    jnz    is_lessthan

If I use "sbb %0,%0" with an "int" return value instead of "setc %0" with a "bool" return value, there's still two extra instructions:

    mov    r10, [r14]
    mov    r11, [r14+8]
    cmp    r10, [r15]
    sbb    r11, [r15+8]
    sbb    eax, eax
    test   eax, eax
    jnz    is_lessthan

What I want is this:

    mov    r10, [r14]
    mov    r11, [r14+8]
    cmp    r10, [r15]
    sbb    r11, [r15+8]
    jc     is_lessthan

GCC extended inline asm is wonderful, otherwise. But I want it to be just as good as an intrinsic function would be, in every way. I want to be able to directly return a boolean value in the form of the state of a CPU flag or flags, without having to "render" it into a general register.

Is this possible, or would GCC (and the Intel C++ compiler, which also allows this form of inline asm to be used) have to be modified or even refactored to make it possible?

Also, while I'm at it — is there any other way my formulation of the compare operator could be improved?


Here we are almost 7 years later, and YES, gcc finally added support for "outputting flags" (added in 6.1.0, released ~April 2016). The detailed docs are here, but in short, it looks like this:

/* Test if bit 0 is set in 'value' */
char a;

asm("bt $0, %1"
    : "=@ccc" (a)
    : "r" (value) );

if (a)
   blah;

To understand =@ccc: The output constraint (which requires =) is of type @cc followed by the condition code to use (in this case c to reference the carry flag).

Ok, this may not be an issue for your specific case anymore (since gcc now supports comparing 128bit data types directly), but (currently) 1,326 people have viewed this question. Apparently there's some interest in this feature.

Now I personally favor the school of thought that says don't use inline asm at all. But if you must, yes you can (now) 'output' flags.

FWIW.


I don't know a way to do this. You may or may not consider this an improvement:

inline bool operator< (const uint128_t &a, const uint128_t &b)
{
    register uint64_t temp = a.hi;
    __asm__(
        "cmpq %2, %1;"
        "sbbq $0, %0;"
        : // outputs:
        /*0*/"=r"(temp)
        : // inputs:
        /*1*/"r"(a.lo),
        /*2*/"mr"(b.lo),
        "0"(temp));

    return temp < b.hi;
}

It produces something like:

mov    rdx, [r14]
mov    rax, [r14+8]
cmp    rdx, [r15]
sbb    rax, 0
cmp    rax, [r15+8]
jc is_lessthan
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜