开发者

Unsigned modulos: alternative approach?

I'm in a need to optimize this really tiny, but pesky function.

unsigned umod(int a, unsigned b)
{
    while(a < 0)
        a += b;

    return a % b;
}

Before you cry out "You don't need to optimize it", please keep in mind that this function is called 50% 开发者_开发百科of the entire program lifetime, as it's being called 21495808 times for the smallest test case benchmark.

The function is already being inlined by the compiler so please don't offer to add the inline keyword.


This avoids looping:

int tmp = a % b;
if (tmp < 0) tmp += b;

Notice that both a and b need to be signed.


This should do it:

unsigned umod(int a, unsigned b)
{
    if (a < 0)
    {
        unsigned r = (-a % b);
        if (r)
            return b - r;
        else
            return 0;
    }
    else
        return a % b;
}

Tested to match original. Limitation is that a > INT_MIN on 2s complement machines.


Using the ~ :)

unsigned umod(int a, unsigned b)
{
    if (a<0) return b-1-~a%b;
    return a%b;
}

The % has higher precedence than -

If it's ok to return b instead of 0 when -a is a multiple of b you can save some ops

unsigned umod(int a, unsigned b)
{
    if (a<0) return b - (-a % b);
    return a%b;
}

slightly golfed version :)

unsigned umod(int a, unsigned b)
{
return(a<0)?b-(-a%b):a%b;
}

Here is the resulting assembly

1    .globl umod3
2       .type   umod3, @function
3    umod3:
4    .LFB3:
5       .cfi_startproc
6       testl   %edi, %edi
7       js      .L18
8       movl    %edi, %eax
9       xorl    %edx, %edx
10      divl    %esi
11      movl    %edx, %eax
12      ret
13      .p2align 4,,10
14      .p2align 3
15   .L18:
16      movl    %edi, %eax
17      xorl    %edx, %edx
18      negl    %eax
19      divl    %esi
20      subl    %edx, %esi
21      movl    %esi, %edx
22      movl    %edx, %eax
23      ret


Since the looping version seems to be quite fast, lets try eliminating the division :)

unsigned umod(int a, unsigned b){
    while(a>0)a-=b;
    while(a<0)a+=b;
    return a;
}


Portable edition, still with only one division, no branching, and no multiplication:

unsigned umod(int a, unsigned b) {
    int rem = a % (int) b;
    return rem + (-(rem < 0) & b);
}


In your original function, you could have returned after the while loop finished for negative numbers, thus skipping the mod. This is in the same spirit, replacing the loop with a multiply - although it could be made to have fewer characters...

unsigned int umod2(int a, unsigned int b)
{
    return (a < 0) ? a + ((-a/b)+1)*b : a % b;
}

Here's the loop version:

unsigned int umod2_works(int a, unsigned int b)
{
    if (a < 0)
    {
        while (a < 0)
            a += b;
        return a;
    } else {
        return a % b;
    }
}

Both have been tested to match the OP's original function.


In a % b, if any of the operands is unsigned both are converted to unsigned. This means that if a is negative, you get a modulo UINT_MAX + 1 value instead of a. If UINT_MAX+1 is evenly divisible by b, then things are fine, and you can just return a % b. If not, you have do do the modulo in int type.

unsigned int umod(int a, unsigned int b)
{
    int ret;
    if (a >= 0) return a % b;
    if (b > INT_MAX) return a + b;
    ret = a % (int)b;
    if (ret < 0) ret += b;
    return ret;
}

Edit: Updated, but you should use caf's answer as that's simpler (or maybe not?!). This is here for the record.


int temp;

temp= (a > 0)? ( a % b ) :   b -( (-a) % b ) ;

code below:

int main()
{
int a;
unsigned b;
int temp;
printf("please enter an int and a unsigned number\n");
scanf("%d",&a);
scanf("%u",&b);
modulus(a,b);
temp= (a > 0)? ( a % b ) :   b -( (-a) % b ) ;
printf("\n temp is %d", temp);
return 0;
}
void modulus(int x,unsigned y)
{
int c;
if(x>0)
{
c=x%y;
printf("\n%d\n",c);}
else
{
while(x<0)
x+=y;
printf("\n%d\n",x);}
}


./a.out
please enter an int and a unsigned number
-8 3

1

 temp is 1


Here's one that works over the whole range of unsigned without branching, but it uses multiplications and 2 divisions

unsigned umod(int a, unsigned b)
{
    return (a>0)*a%b+(a<0)*(b-1-~a%b);
}


If a and b are both much smaller than an int, then you can just add a sufficiently large multiple of b to every value before you mod.

unsigned umod(int a, unsigned b)
{
    return (unsigned)(a + (int)(b * 256)) % b;
}

Of course, this trick doesn't work if a + (b * 256) can overflow, but for a lot of the uses I can see for this code, you can be certain that it never will.


Other than the while loop, Not sure whether the % operation can be optimized as a whole, but the optimization can happen on the pattern of the values for a & b.

If in those 21495808 times the operation is executed.

If the chances of passing a value for a which is less than b ( a < b ) is atleast half of the that. Adding the following statement will definetly improve the overall performance of the function.

if ( abs(a) < b ) // not specifically the abs function, can be your own implementation.
    return 0;
else
    return a%b;

If b is a power of 2 for atleast 80% of the cases, we can use bitwise operators as in

return ( abs(a) & (b-1) );

If the numbers are expected to be anything less than that, it would degrade the performance, as we need to verify whether b is power of 2 [ even after using bitwise operators for the same ] for everything.

Even the functionality to achieve abs(a) can be optimized using bitwise operators, with their own limitations, but is faster than verifying whether a is < 0.

n = (a ^ (a >> 31)) - (a >> 31); // instead of n = a < 0 ? -a : a;

There would be more such things, if you can explore.


My preferred solution is to mod twice. I haven't tried this in C/C++ or with unsigned, but my test cases work in Java:

((a % b) + b) % b

The benefit is no branching and simplicity. The downside is the double mod. I haven't compared performance, but my understanding is that it is branching that hurts performance these days.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜