开发者

How to integer-divide round negative numbers *down*?

Seems like whenever I divide a negative int by a positive int, I need it to round down (toward -inf), not toward 0. But both C# and C++ round toward 0.

So I guess I need a DivideDownward() method. I can write开发者_如何学Go it in a few lines with a test for negative and so on, but my ideas seem klugey. So I'm wondering if I'm missing something and if you have an "elegant" way to round negative division downward.


Whenever I divide a negative int by a positive int, I need it to round down.

It's hell, isn't it? Knuth wrote why this is the right way to do things, but we're stuck with legacy integer hardware.

  • If you can afford the loss of precision, the simplest and cleanest way to do this is to cast a 32-bit integer to a 64-bit double and use the FP rounding mode to round toward minus infinity when you convert the quotient back to integer. Today's floating-point units are pretty fast and may actually divide faster than an integer unit; to be sure, you'd have to measure.

  • If you need full 64-bit integer precision, I've dealt with this problem as a compiler writer by doing the two conditional branches so that you wind up dividing the magnitudes, then get the correct sign. But this was a while back when the conditional branch was cheap compared to a divide; on today's hardware, I would have to experiment before I'd be able to recommend something.

  • In principle, you could pull the floating-point trick on 64-bit ints by using the legacy Intel 80-bit floating-point numbers, but it's wildly unportable, and I don't trust Intel to keep making that unit fast. These days the floating point speed is in the SSE unit.

  • Places to look for other tricks would include Hank Warren's book Hacker's Delight (my copy is at work) and the MLton compiler for Standard ML, which requires integer division to round toward minus infinity.

Whatever you do, when you get settled on it, if you are using C++ or C99, stick your divide routine into a .h file and make it static inline. That way when your solution turns out to be suboptimal for new whizbang hardware delivered in 5 years, you have one place to change it.


Assuming b is always positive, here is a cheap solution:

inline int roundDownDivide(int a, int b) {
  if (a >= 0) return a/b; 
  else return (a-b+1)/b;
}


You can get rid of any branching by doing this:

inline int DivideRoundDown(int a_numerator, int a_denominator)
{
    return (a_numerator / a_denominator) + ((a_numerator % a_denominator) >> 31);
}


If you wanted to write this just using integers in a relatively succinct way, then you can write this:

var res = a / b - (a % b < 0 ? 1 : 0);

This probably compiles to quite a few instructions, but it may still be faster than using floating-points.


Note: this post produces incorrect results for input with a=-1. Please see other answers.


[c++]

This is probably what you meant by 'kludgey', but it's what I came up with.

int divideDown(int a, int b){
    int r=a/b;
    if (r<0 && r*b!=a)
        return r-1;
    return r;
}

In the if statement, I put r<0 - however I'm not sure if that's what you want. You may wish to change the if statement to

if (a<0 && b>0)

which would be consistent with your description "Seems like whenever I divide a negative int by a positive int ".


Many programming languages, including the older standards of C and C++, guarantee the division rule, which is

a = b * (a / b) + a % b

even if the exact values of a/b and a%b are left undefined.[0] This can be exploited to calculate the desired result in many languages and platforms using (an equivalent of) the following code:

int divF(int a, int b) { return a / b - (a % b < 0); }

This is the version from @TomasPetricek's answer. However, it works only for b > 0!

The following code works for any b != 0:[1]

int sign(int x) { return (x > 0) - (x < 0); }
int divF2(int a, int b) { return a / b - (sign(a % b) == -sign(b)); }

However, the rounding-down division (aka flooring division, aka Knuth's division) is not always desirable. It is argued[2] that Euclidean division is the most generally useful one. It rounds down for b > 0, and up for b < 0. It has the nice property that the value of the compatibly defined remainder is always non-negative, for all a and b, independently of their signs. Additionally it coincides with bit-shifts and masking on twos complement machines for power-of-two divisors. Yes, it is also faster to calculate:

int divE(int a, int b) {
    int c = a % b < 0;
    return a / b + (b < 0 ? c : -c);
}

All three versions generate branchless code with clang -O3 on amd64. However, on twos complement architectures, the following may be marginally faster:

int divE2(int a, int b) { return a / b + (-(a % b < 0) & (b < 0 ? 1 : -1)); }

Generated code

divF:
    cdq
    idiv    esi
    sar edx, 31
    add eax, edx

divF2:
    cdq
    idiv    esi
    xor ecx, ecx
    test    edx, edx
    setg    cl
    sar edx, 31
    add edx, ecx
    xor ecx, ecx
    test    esi, esi
    setg    cl
    shr esi, 31
    sub esi, ecx
    xor ecx, ecx
    cmp edx, esi
    sete    cl
    sub eax, ecx

Chazz:
    cdq
    idiv    esi
    test    edx, edx
    cmove   edi, esi
    xor edi, esi
    sar edi, 31
    add eax, edi

divE:
    cdq
    idiv    esi
    mov ecx, edx
    shr ecx, 31
    sar edx, 31
    test    esi, esi
    cmovs   edx, ecx
    add eax, edx

divE2:
    cdq
    idiv    esi
    sar edx, 31
    shr esi, 31
    lea ecx, [rsi + rsi]
    add ecx, -1
    and ecx, edx
    add eax, ecx

Benchmarks

simple truncating division:
    2464805950:  1.90 ns -- base

euclidean division:
    2464831322:  2.13 ns -- divE
    2464831322:  2.13 ns -- divE2

round to -inf for all b:
    1965111352:  2.58 ns -- Chazz
    1965111352:  2.64 ns -- divF2
    1965111352:  5.02 ns -- Warty

round to -inf for b > 0, broken for b < 0:
    1965143330:  2.13 ns -- ben135
    1965143330:  2.13 ns -- divF
    1965143330:  2.13 ns -- Tomas
    4112595000:  5.79 ns -- runevision

round to -inf, broken for b < 0 or some edge-cases:
    4112315315:  2.24 ns -- DrAltan
    1965115133:  2.45 ns -- Cam
    1965111351:  7.76 ns -- LegsDrivenCat

Compiled with clang -O3 on FreeBSD 12.2, i7-8700K CPU @ 3.70GHz. The first column is a checksum grouping algorithms producing the same result. base is the simplest truncating division used to measure the testing overhead.

The test-bed:

static const int N = 1000000000;
int rng[N][2], nrng;
void push(int a, int b) { rng[nrng][0] = a, rng[nrng][1] = b, ++nrng; }
SN_NOINLINE void test_case(int (*func)(), const char *name) {
    struct timespec t0, t1;
    clock_gettime(CLOCK_PROF, &t0);
    int sum = func();
    clock_gettime(CLOCK_PROF, &t1);
    double elapsed = (t1.tv_sec - t0.tv_sec)*1.e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("%10u: %5.2f ns -- %s\n", sum, elapsed/N, name);
}

#define ALL_TESTS(X) X(base) X(divF) X(divF2) X(divE) X(divE2) X(ben135) X(DrAltan) X(Cam) X(Tomas) X(Warty) X(Chazz) X(runevision) X(LegsDrivenCat)

#define LOOP_TEST(X) \
    SN_NOINLINE int loop_##X() { \
        int sum = 0; \
        for(int i = 0; i < N; ++i) sum += X(rng[i][0], rng[i][1]); \
        return sum; \
    } \
/**/
ALL_TESTS(LOOP_TEST)

int main() {
    srandom(6283185);
    push(INT_MAX, 1); push(INT_MAX, -1); push(INT_MIN, 1);
    push(INT_MAX, 2); push(INT_MAX, -2); push(INT_MIN, 2); push(INT_MIN, -2);
    while(nrng < N) {
        int a = random() - 0x40000000, b;
        do b = (random() >> 16) - 0x4000; while(b == 0);
        push(a,b);
    }

#define CALL_TEST(X) test_case(loop_##X, #X);
    ALL_TESTS(CALL_TEST)
}

Footnotes/references

  1. Nowadays they are defined in C and C++ to round by truncation. So negatives are rounded upwards, positives downwards. This is what hardware divisors do. This is a bummer, cause it's the least useful rounding rule.
  2. Daan Leijen. Division and Modulus for Computer Scientists. December 2001.
  3. Raymond T. Boute. The Euclidean definition of the functions div and mod. April 1992.


Here's a version that works when the divisor is negative:

int floor_div2(int a, int b)
{
  int rem = a % b;
  int div = a/b;
  if (!rem)
    a = b;
  int sub = a ^ b;
  sub >>= 31;
  return div+sub;
}


Math.Floor((double)a/(double)b)
if you need it as an int, cast it afterwards
(int)Math.Floor((double)a/(double)b)


For powers of 2, depending on your compiler implementation (Visual Studio), you could use a right shift.

-5 >> 2 == -2


Dr Atlans solution above is the most general answer, but if you know that a isn't going to be less than some fixed number, you can add the smallest multiple of b that will make it positive and then adjust the result. For example, if a is always >= -10 * b, you can say:

return (a + 10 * b) / b - 10

or if you just want a correct result when a is positive or 0 and some negative result when a is negative (eg for testing if the result is >= 0 without including the range a = [-1 ; -b +1]) you can do

return (a + b) / b - 1;

which will return -1 for [-1 , -b + 1] as well as for [-b ; -2 * b + 1], but that doesn't matter if you're just trying establish if the result of the division is positive. Effectively we're just changing the rounding so that it rounds towards -1 instead of 0.


I'm no expert in low-level optimization, but here's a version with no branching (conditionals) which also doesn't require conversions to floating point numbers. Tested in C#, it seems to be significantly faster there than the versions using branching.

return (a - (((a % b) + b) % b)) / b;

This solution is partially derived from the discussions about how to do the best modulo function for negative numbers: Modulo of negative numbers


Edit: My solution didn't work in all cases. Here's a C# implementation of Dr.Atlan's solution:

/// <summary>
/// Divides a/b, rounding negative numbers towards -Inf.
/// </summary>
/// <param name="a">Dividend</param>
/// <param name="b">Divisor; must be positive for correct results</param>
/// <returns>a ÷ b</returns>
public static int Div(int a, int b)
{
    return a < 0 ? (a - b + 1) / b : a / b;
}


Do the calculation in an unsigned long by offsetting with 2^31, eg:

int floor_div(int a, int b)
{
  unsigned long aa = a + (1UL << 31);
  unsigned long cc = (aa / b) - (1UL << 31);
  return (int)cc
}

I have not tested this.


I did it this way (only one division, no multiplication, no modulo, looks like fastest solution):

(a < 0 && b > 0) ? (a - b + 1) / b : (a > 0 && b < 0) ? (a - b - 1) / b : a / b

I was curious what is faster: less branching or less multiplications/divisions/modulos, so I compared my solution with runevision's one and in this particular case less divisions is better than less branching. ben135's solution gives incorrect results in some cases (e.g. 10/-3 gives -3 but should be -4).

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜