开发者

Algorithm for fixed-point multiplication

I'm trying to rescale a timestamp (fractional part of seconds only) from nanoseconds (units of 10^-9 seconds) to the lower half of an NTP timestamp (units of 2^-32 seconds). Effectively this means multiplying by 4.2949673. But I need to do it without floating-point math, and without using integers larger than 32 bits (in fact, I'm actually writing this for an 8-bit microcontroller, so even 32-bit math is expensive, especially divisions).

I've come up with a few algorithms that work reasonably well, but I don't have any real grounding in numerical methods so I'd appreciate any suggestions as to how to improve them, or any other algorithms that would be more accurate and/or faster.

Algorithm 1

uint32_t intts = (ns >> 16) * 281474 + (ns << 16) / 15259 + ns / 67078;

The first two constants were chosen to slightly undershoot, rather than overshoot, the correct figure, and the final factor o开发者_如何学JAVAf 67078 was determined empirically to correct for this. Produces results within +/- 4 NTP units of the correct value, which is +/- 1 ns -- acceptable, but the residual changes with ns. I guess I could add another term.

Algorithm 2

uint32_t ns2 = (2 * ns) + 1;
uint32_t intts = (ns2 << 1)
  + (ns2 >> 3) + (ns2 >> 6) + (ns2 >> 8) + (ns2 >> 9) + (ns2 >> 10)
  + (ns2 >> 16) + (ns2 >> 18) + (ns2 >> 19) + (ns2 >> 20) + (ns2 >> 21)
  + (ns2 >> 22) + (ns2 >> 24) + (ns2 >> 30) + 3;

Based on the binary expansion of 4.2949673 (actually based on the binary expansion of 2.14748365, since I start by doubling and adding one to accomplish rounding). Possibly faster than algorithm 1 (I haven't gotten out the benchmarks yet). The +3 was determined empirically to cancel out undershoot from truncating all those low-order bits, but it doesn't do the best possible job.


uint32_t convert(uint32_t x) {
    const uint32_t chi = 0x4b82;
    const uint32_t clo = 0xfa09;
    const uint32_t round = 0x9525;
    const uint32_t xhi = x >> 16;
    const uint32_t xlo = x & 0xffff;
    uint32_t lowTerm = xlo*clo;
    uint32_t crossTerms = xhi*clo + xlo*chi;
    uint32_t rounded = crossTerms + (lowTerm >> 16) + round >> 16;
    uint32_t highTerm = xhi*chi;
    return (x << 2) + highTerm + rounded;
}

Basic fixed-point multiplication, simulating a 32x32 -> 64 product using four 16x16 -> 32 products. The constant round was chosen to minimize the error using a simple binary search. This expression is good to +/-0.6 NTP over the entire valid range.

The leading 4 in the scale factor is handled in the shift. Compilers typically can generate pretty decent code for this type of thing, but it can often be streamlined with hand-written assembly if you need to.

If you don't need so much accuracy, you can get rid of the lowTerm and round and get an answer that's good to +/-1.15 NTP:

uint32_t convert(uint32_t x) {
    const uint32_t chi = 0x4b82;
    const uint32_t clo = 0xfa09;
    const uint32_t xhi = x >> 16;
    const uint32_t xlo = x & 0xffff;
    uint32_t crossTerms = xhi*clo + xlo*chi;
    uint32_t highTerm = xhi*chi;
    return (x << 2) + highTerm + (crossTerms >> 16) + 1;
}


I might be stating the obvious… but have you googled the interwebz for fixed point math libraries? There's plenty of them. Here's a good one with C++ and x86 implementations on Flipcode's archives:

http://www.flipcode.com/archives/Fixed_Point_Routines.shtml

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜