Seeding the Newton iteration for cube root efficiently

2023-04-05 12:00 问答作者：

How can I find the cube root of a number in an efficient way? I think Newton-Raphson method can be used, but I don't know how to guess the initial solut开发者_StackOverflowion programmatically to minimize the number of iterations.

This is a deceptively complex question. Here is a nice survey of some possible approaches.

In view of the "link rot" that overtook the Accepted Answer, I'll give a more self-contained answer focusing on the topic of quickly obtaining an initial guess suitable for superlinear iteration.

The "survey" by metamerist (Wayback link) provided some timing comparisons for various starting value/iteration combinations (both Newton and Halley methods are included). Its references are to works by W. Kahan, "Computing a Real Cube Root", and by K. Turkowski, "Computing the Cube Root".

metamarist updates the DEC-VAX era bit-fiddling technique of W. Kahan with this snippet, which "assumes 32-bit integers" and relies on IEEE 754 format for doubles "to generate initial estimates with 5 bits of precision":

inline double cbrt_5d(double d) 
{ 
   const unsigned int B1 = 715094163; 
   double t = 0.0; 
   unsigned int* pt = (unsigned int*) &t; 
   unsigned int* px = (unsigned int*) &d; 
   pt[1]=px[1]/3+B1; 
   return t; 
}

The code by K. Turkowski provides slightly more precision ("approximately 6 bits") by a conventional powers-of-two scaling on float fr, followed by a quadratic approximation to its cube root over interval [0.125,1.0):

/* Compute seed with a quadratic qpproximation */
fr = (-0.46946116F * fr + 1.072302F) * fr + 0.3812513F;/* 0.5<=fr<1 */

and a subsequent restoration of the exponent of two (adjusted to one-third). The exponent/mantissa extraction and restoration make use of math library calls to frexp and ldexp.

Comparison with other cube root "seed" approximations

To appreciate those cube root approximations we need to compare them with other possible forms. First the criteria for judging: we consider the approximation on the interval [1/8,1], and we use best (minimizing the maximum) relative error.

That is, if f(x) is a proposed approximation to x^{1/3}, we find its relative error:

        error_rel = max | f(x)/x^(1/3) - 1 | on [1/8,1]

The simplest approximation would of course be to use a single constant on the interval, and the best relative error in that case is achieved by picking f_0(x) = sqrt(2)/2, the geometric mean of the values at the endpoints. This gives 1.27 bits of relative accuracy, a quick but dirty starting point for a Newton iteration.

A better approximation would be the best first-degree polynomial:

 f_1(x) = 0.6042181313*x + 0.4531635984

This gives 4.12 bits of relative accuracy, a big improvement but short of the 5-6 bits of relative accuracy promised by the respective methods of Kahan and Turkowski. But it's in the ballpark and uses only one multiplication (and one addition).

Finally, what if we allow ourselves a division instead of a multiplication? It turns out that with one division and two "additions" we can have the best linear-fractional function:

 f_M(x) = 1.4774329094 - 0.8414323527/(x+0.7387320679)

which gives 7.265 bits of relative accuracy.

At a glance this seems like an attractive approach, but an old rule of thumb was to treat the cost of a FP division like three FP multiplications (and to mostly ignore the additions and subtractions). However with current FPU designs this is not realistic. While the relative cost of multiplications to adds/subtracts has come down, in most cases to a factor of two or even equality, the cost of division has not fallen but often gone up to 7-10 times the cost of multiplication. Therefore we must be miserly with our division operations.

static double cubeRoot(double num) {
    double x = num;

    if(num >= 0) {
        for(int i = 0; i < 10 ; i++) {
            x = ((2 * x * x * x) + num ) / (3 * x * x); 
        }
    } 
    return x;
}

It seems like the optimization question has already been addressed, but I'd like to add an improvement to the cubeRoot() function posted here, for other people stumbling on this page looking for a quick cube root algorithm.

The existing algorithm works well, but outside the range of 0-100 it gives incorrect results.

Here's a revised version that works with numbers between -/+1 quadrillion (1E15). If you need to work with larger numbers, just use more iterations.

static double cubeRoot( double num ){
    boolean neg = ( num < 0 );
    double x = Math.abs( num );
    for( int i = 0, iterations = 60; i < iterations; i++ ){
        x = ( ( 2 * x * x * x ) + num ) / ( 3 * x * x ); 
    }
    if( neg ){ return 0 - x; }
    return x;
}

Regarding optimization, I'm guessing the original poster was asking how to predict the minimum number of iterations for an accurate result, given an arbitrary input size. But it seems like for most general cases the gain from optimization isn't worth the added complexity. Even with the function above, 100 iterations takes less than 0.2 ms on average consumer hardware. If speed was of utmost importance, I'd consider using pre-computed lookup tables. But this is coming from a desktop developer, not an embedded systems engineer.

继续阅读：algorithm math

Seeding the Newton iteration for cube root efficiently

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？