
How do I most effectively prevent my normally-distributed random variable from being zero?

I'm writing a Monte Carlo algorithm in which, at one point, I need to divide by a random variable. More precisely: the random variable is used as a step width for a difference quotient, so I first multiply something by the variable and then divide it back out of some locally linear function of this expression. Like this:

#include <tr1/random>   // TR1 random header (GCC); MSVC 9 ships these in <random>
#include <ctime>

double f(double);

// Standard-normal generator (mean 0, default sigma 1) seeded from the clock.
std::tr1::variate_generator<std::tr1::mt19937, std::tr1::normal_distribution<> >
  r( std::tr1::mt19937(time(NULL)),
     std::tr1::normal_distribution<>(0) );

double h = r();                    // random step width
double a = ( f(x+h) - f(x) ) / h;  // difference quotient; fails when h == 0

This works fine most of the time, but fails when h=0. Mathematically this is not a concern: in any finite (or, indeed, countable) selection of normally-distributed random variables, all of them will be nonzero with probability 1. But in the digital implementation I will encounter an h==0 roughly every 2³² function calls (regardless of the Mersenne twister having a period far longer than the age of the universe, it still outputs ordinary 32-bit integers!).

It's pretty simple to avoid this trouble; at the moment I'm doing

double h = r();
while (h==0) h=r();

but I don't consider this particularly elegant. Is there any better way?


The function I'm evaluating is actually not a simple ℝ → ℝ like f above, but an ℝᵐ×ℝⁿ → ℝ: I calculate the gradient in the ℝᵐ variables while numerically integrating over the ℝⁿ variables. The whole function is overlaid with unpredictable (but "coherent") noise, sometimes with specific (but unknown) dominant frequencies; that's what gets me into trouble when I try fixed values for h.


Your way seems elegant enough; here is a slightly different formulation:

do {
    h = r();
} while (h == 0.0);
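
If the retry is needed in more than one place, you can hide the loop behind a small helper so callers never see a zero draw. A minimal sketch, assuming the TR1 generator from the question (the name nonzero_draw is mine):

// Redraws until the generator returns a nonzero value, so a later
// difference quotient can never divide by zero.
template <typename Gen>
double nonzero_draw(Gen &gen) {
    double h;
    do {
        h = gen();
    } while (h == 0.0);
    return h;
}

// Usage with the question's generator:
//   double h = nonzero_draw(r);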


The ratio of two independent zero-mean normally-distributed random variables follows the Cauchy distribution. The Cauchy distribution is one of those nasty distributions with an infinite variance (its mean is not even defined). Very nasty indeed. A Cauchy distribution will make a mess of your Monte Carlo experiment.

In many cases where the ratio of two random variables is computed, the denominator is not normal. People often use a normal distribution to approximate this non-normally distributed random variable because

  • normal distributions are usually so easy to work with,
  • they usually have such nice mathematical properties,
  • the normal assumption appears to be more or less correct, and
  • the real distribution is a bear.

Suppose you are dividing by distance. Distance is nonnegative by definition, and is often strictly positive as a random variable. So right off the bat distance can never be exactly normally distributed. Nonetheless, people often assume a normal distribution for distance in cases where the mean is much, much larger than the standard deviation. When this normal assumption is made you need to protect against the non-physical values. One simple solution is a truncated normal, as sketched below.
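
As a sketch of that last idea, plain rejection sampling is the simplest (if not the most efficient) way to draw from a truncated normal; standard_normal, mu, sigma, and lo are placeholders of my choosing, not anything from the answer above:

// Rejection-sample a normal(mu, sigma) restricted to x > lo.
// Cheap as long as P(X > lo) is not tiny; for far-tail truncation a
// dedicated truncated-normal sampler is preferable.
template <typename Gen>
double truncated_normal(Gen &standard_normal, double mu, double sigma, double lo) {
    double x;
    do {
        x = mu + sigma * standard_normal();  // rescale an N(0,1) draw
    } while (x <= lo);
    return x;
}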


If you want to preserve the normal distribution, you have to either exclude 0 or map 0 to some new, previously non-occurring value. Since the second is most likely impossible given the finite ranges of computer arithmetic, the first is our only option.


The function (f(x+h) − f(x))/h has a limit as h → 0, so if you encounter h == 0 you can simply use that limit. The limit is f′(x), so if you know the derivative you can substitute it; a sketch follows.
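
A minimal sketch of that fallback, assuming the caller can supply the analytic derivative (fprime is a hypothetical function, not something from the question):

double f(double);
double fprime(double);   // hypothetical: the analytic derivative of f

// Difference quotient that substitutes the exact limit f'(x) when the
// random step width happens to be exactly zero.
double quotient(double x, double h) {
    return (h != 0.0) ? (f(x + h) - f(x)) / h : fprime(x);
}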

If, though, what you are actually doing is creating a number of discrete points that approximate a normal distribution, and this approximation is good enough for your purposes, construct them so that none of them actually takes the value 0.


Depending on what you're trying to compute, perhaps something like this would work:

double h = r();
double a;
if (h != 0)
    a = ( f(x+h) - f(x) ) / h;
else
    a = 0;

Note that if f is linear with slope m, the quotient equals m for every h ≠ 0, so the limit at h = 0 is m, not 0; a = 0 only keeps things continuous when the slope happens to be zero. If you know the derivative, f′(x) makes a better fallback value.

You might also want to instead consider trapping division-by-zero exceptions to avoid the cost of the branch. Note that under IEEE arithmetic a floating-point division by zero silently produces ±inf by default, so the trap has to be enabled explicitly. This may or may not have a detrimental effect on performance: benchmark both ways!

On Linux, you will need to compile the file that contains your potential division by zero with -fnon-call-exceptions, enable the floating-point trap, and install a SIGFPE handler:

#include <csignal>
#include <fenv.h>   // feenableexcept is a glibc extension

struct fp_exception { };

void sigfpe(int) {
  signal(SIGFPE, sigfpe);   // re-arm the handler
  throw fp_exception();     // only safe because of -fnon-call-exceptions
}

void setup() {
  // FP division by zero is silent (yields inf) unless the trap is unmasked.
  feenableexcept(FE_DIVBYZERO);
  signal(SIGFPE, sigfpe);
}

// Later...
    try {
        run_one_monte_carlo_trial();
    } catch (fp_exception &) {
        // skip this trial
    }

On Windows, use SEH (the floating-point divide-by-zero exception must likewise be unmasked first, e.g. via _controlfp_s):

__try 
{ 
    run_one_monte_carlo_trial();
} 
__except(GetExceptionCode() == EXCEPTION_FLT_DIVIDE_BY_ZERO ?   // the division is floating-point, not integer
         EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
{ 
    // skip this trial
}

This has the advantage of potentially having less effect on the fast path: there is no branch, although there may be some adjustment of exception handler records. On Linux, there may be a small performance hit because the compiler generates more conservative code under -fnon-call-exceptions. This is less likely to be a problem if the code compiled under -fnon-call-exceptions does not allocate any automatic (stack) C++ objects. It's also worth noting that this approach makes the case in which division by zero does happen very expensive.
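
If handler installation is too invasive, a non-trapping alternative is to let the division proceed (under IEEE arithmetic it just produces an infinity or NaN) and test the sticky floating-point flags afterwards via the standard <fenv.h> interface. A sketch under that assumption (safe_quotient is my name; strictly conforming code also wants #pragma STDC FENV_ACCESS ON):

#include <fenv.h>

// Perform the division unconditionally, then check whether it raised
// the divide-by-zero or invalid-operation flag, instead of trapping.
double safe_quotient(double fxh, double fx, double h) {
    feclearexcept(FE_DIVBYZERO | FE_INVALID);
    double a = (fxh - fx) / h;
    if (fetestexcept(FE_DIVBYZERO | FE_INVALID))
        return 0.0;  // or redraw h, or fall back to a known derivative
    return a;
}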

