Fixed-point multiplication in a known range
I'm trying to multiply A*B
in 16-bit fixed point, while keeping as much accuracy as possible. A
is 16-bit in unsigned integer range, B
is divided by 1000 and always between 0.001
and 9.999
. It's been a while since I dealt with problems like that, so:
- I know I can just do
A*B/1000
after moving to 32-bit variables, then strip back to 16-bit - I'd like to make it faster than that
- I'd like to do all the operations without moving to 32-bit (since I've got 16-bit multiplication only)
Is there any easy way to do that?
Edit: A
will be between 0 and 4000, so all possible results are in the 16-bit开发者_运维技巧 range too.
Edit: B
comes from user, set digit-by-digit in the X.XXX
mask, that's why the operation is /1000
.
No, you have to go to 32 bit. In general the product of two 16 bit numbers will always give you a 32 bit wide result.
You should check the CPU instruction set of the CPU you're working on because most multiply instructions on 16 bit machines have an option to return the result as a 32 bit integer directly.
This would help you a lot because:
short testfunction (short a, short b)
{
int A32 = a;
int B32 = b;
return A32*B32/1000
}
Would force the compiler to do a 32bit * 32bit multiply. On your machine this could be very slow or even done in multiple steps using 16bit multiplies only.
A little bit of inline assembly or even better a compiler intrinsic could speed things up a lot.
Here is an example for the Texas Instruments C64x+ DSP which has such intrinsics:
short test (short a, short b)
{
int product = _mpy (a,b); // calculates product, returns 32 bit integer
return product / 1000;
}
Another thought: You're dividing by 1000. Was that constant your choice? It would be much faster to use a power of two as the base for your fixed-point numbers. 1024 is close. Why don't you:
return (a*b)/1024
instead? The compiler could optimize this by using a shift right by 10 bits. That ought to be much faster than doing reciprocal multiplication tricks.
精彩评论