开发者

What are the benefits of the following C code?

Given log(a) and log(b), return log(a+b)

double log_sum(double log_a, double log_b){
    double v;
    if(log_a < log_b){
        v=log_b+log(1+exp(log_a-log_b));
    }
    else{
        v=log_a+log(1+exp(log_b-log_a));
    }
    return v;
}

I want to know what are the benefits of the above func开发者_运维知识库tion?


The primary (brute force) alternative looks like:

v = log(exp(log_a) + exp(log_b));

That has three transcendental function evaluations.

The computation shown uses just two transcendental functions - and should be quicker.

It may also be more stable numerically.


Computers and logs don't always get along. As has been alluded to by others, accuracy becomes a real problem. This blog post goes a long way toward explaining this phenomenon. The article is about seemingly unnecessary library functions, and why they are actually very handy.

The function log1p computes log(1 + x). How hard could this be to implement?

There are all sorts of crazy rules and transforms you can use when dealing with logs/exponentials. What I'm guessing is that the author used some of these rules to either make the calculation more accurate, more efficient, or both.


Others have already mentioned the potential loss of precision, but in this case the problem is really overflows. Try this:

double log_a = 100;
double log_b = 1000;
printf("%f\n", log_b+log(1+exp(log_a-log_b)));
printf("%f\n", log_a+log(1+exp(log_b-log_a)));

On a typical platform, the first one will print "inf" while the second one will print "1000.000000".


If you mean as opposed to log(exp(log_a) + exp(log_b)), then the benefit is pretty clear; the way you mention only has to calculate one log and one exp, while this way has to calculate two exps. That is far more costly than an extra addition/subtraction/if test.


Not really an answer, but a possible clue.

Numbers kept in "log form" can be multiplied or divided by simply adding or subtracting the numbers. For example, exp(log(a) + log(b)) is the same as a * b. Or, using a = 41, b = 101, this would be exp(3.71357 + 4.61512), which is exp(8.32869), or 4140.98930. Obviously precision plays a part, and I truncated the numbers to 5 digits. 41 * 101 is 4141.

I haven't worked through your example code, nor is it immediately obvious to me why your code does things the way it does, but hopefully the above will be able to help you piece it together.

EDIT: I ran some numbers through your example code. If a = 41 and b = 101, and log_a = 3.71357 and log_b = 4.61512, then your example code computes 4.95582, and exp(4.95582) is equal to 142.0. The "simpler" way to get this same result is log(exp(log_a) + exp(log_b)), but as others have pointed out, this way involves three expensive transcendental functions, whereas your example code requires only two (plus a trivial comparison).


I guess you ask why this function is better than directly computing log(a+b) by restoring a and b, summing them and computing log():

 log( exp( log_a ) + exp( log_b ) )

In that case you need to compute exponent twice, and in the function you ask about exponent is computed only once. Since computing exponent is relatively time consuming that might be faster.


Others have posted good answers for why to do this at all. I'm wondering about the if/else part. Regardless of whether log_a or log_b is bigger, both expressions for v should be equivalent to log(a+b). In each case 0 < exp( ... ) <= 1, and log(1+exp( ... )) is a small positive number. For some reason I don't know this must be good.


If you're asking about the if/else, it's to avoid loss of precision. All arithmetic operations on floating point numbers (except for multiplication by powers of 2 and certain cases of addition/subtraction of numbers of the same exponent) destroy information, and good floating point code will choose the method with the least loss of precision.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜