开发者

Multiplication algorithm for abritrary precision (bignum) integers

I'm writing a small bignum library for a homework project. I am to implement Karatsuba multiplication, but before that I would like to write a naive multiplication routine.

I'm following a guide written by Paul Zimmerman titled "Modern Computer Arithmetic" which is freely available online.

On page 4, there is a description of an algorithm titled BasecaseMultiply which performs gradeschool multiplication.

I understand step 2, 3, where B^j is a digit shift of 1, j times. But I don't understand step 1 and 3, where we have A*b_j. How is this multiplication meant to be carried out if the bignum multiplication hasn't been defined yet?

Would the operation "*" in this algorithm just be the repeated addition method?

Here is the parts I have written thus far. I have unit tested them so they appear to be correct for the most part:

The structure I use for my bignum is as follows:

#define BI开发者_运维问答GNUM_DIGITS 2048
typedef uint32_t u_hw; // halfword
typedef uint64_t u_w; // word

typedef struct {
    unsigned int sign; // 0 or 1
    unsigned int n_digits;
    u_hw digits[BIGNUM_DIGITS];
} bn;

Currently available routines:

bn *bn_add(bn *a, bn *b); // returns a+b as a newly allocated bn
void bn_lshift(bn *b, int d); // shifts d digits to the left, retains sign
int bn_cmp(bn *a, bn *b); // returns 1 if a>b, 0 if a=b, -1 if a<b


I wrote a multiplication algorithm a while ago, and I have this comment at the top. If you have two numbers x and y of the same size (same n_digits) then you would multiply like this to get n, which would have twice the digits. Part of the complexity of the algorithm comes from working out which bits not to multiply if n_digits is not the same for both inputs.

Starting from the right, n0 is x0*y0 and you save off the overflow. Now n1 is the sum of x1*y0 and y1*x0 and the previous overflow shifted by your digit size. If you are using 32 bit digits in 64 bit math, that means n0 = low32(x0*y0) and you carry high32(x0*y0) as the overflow. You can see that if you actually used 32 bit digits you could not add the center columns up without exceeding 64 bits, so you probably use 30 or 31 bit digits.

If you have 30 bits per digit, that means you can multiple two 8 digit numbers together. First write this algorithm to accept two small buffers with n_digits up to 8 and use native math for the arithmetic. Then implement it again, taking arbitrary sized n_digits and using the first version, along with your shift and add method, to multiply 8x8 chunks of digits at a time.

/*
    X*Y = N

                          x0     y3
                            \   /  
                             \ /   
                              X    
                      x1     /|\     y2
                        \   / | \   /  
                         \ /  |  \ /   
                          X   |   X    
                  x2     /|\  |  /|\     y1
                    \   / | \ | / | \   /  
                     \ /  |  \|/  |  \ /   
                      X   |   X   |   X    
              x3     /|\  |  /|\  |  /|\     y0
                \   / | \ | / | \ | / | \   /
                 \ /  |  \|/  |  \|/  |  \ /
                  V   |   X   |   X   |   V
                  |\  |  /|\  |  /|\  |  /|
                  | \ | / | \ | / | \ | / |
                  |  \|/  |  \|/  |  \|/  |
                  |   V   |   X   |   V   |
                  |   |\  |  /|\  |  /|   |
                  |   | \ | / | \ | / |   |
                  |   |  \|/  |  \|/  |   |
                  |   |   V   |   V   |   |
                  |   |   |\  |  /|   |   |
                  |   |   | \ | / |   |   |
                  |   |   |  \|/  |   |   |
                  |   |   |   V   |   |   |
                  |   |   |   |   |   |   |
              n7  n6  n5  n4  n3  n2  n1  n0
*/


To do A*b_j, you need to do the grade school multiplication of a bignum with a single digit. You end up having to add a bunch of two-digit products together:

bn *R = ZERO;
for(int i = 0; i < n; i++) {
  bn S = {0, 2};
  S.digits[0] = a[i] * b_j;
  S.digits[1] = (((u_w)a[i]) * b_j) >> 32;  // order depends on endianness
  bn_lshift(S, i);
  R = bn_add(R, S);
}

Of course, this is very inefficient.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜