Why does this code not optimize for all three points?

2023-01-28 22:07 问答作者：

Background

I am trying to fit a distribution to a 95% CI and mode. The cost function that I am using solves three functions for 0: P(X=2.5 | mu, sigma)=0.025, P(X=7.5|mu, sigma)=0.975, and the mode of log-N(mu, sigma) = 3.3. note: mode of a lognormal is = $e^{\mu-\sigma^2)}$:

Approach

First I write a cost function, prior

prior <- function(parms) {
  a <- abs(plnorm(2.5, parms[1], parms[2]) - 0.025)
  b <- abs(plnorm(7.5, parms[1], parms[2]) - 0.975)
  mode <- exp(parms[1] - parms[2]^2)
  c <- abs(mode-3.3)
  return(a + b + c)
}

And then I seek parameters that minimize the cost function

v = nlm(prior,c(log(3.3),0.14))

It is apparent that the function is maximized for the mode an LCL but not the UCL.

abs(plnorm(7.5, parms[1], parms[2]) - 0.975)
> [1] 0.02499989

Here is a plot with dotted lines at the desired 95%CI:

x <- seq(0,10,0.1)
plot(x,dlnorm(x, v$estimate[1],v$estimate[2]),type='l')
ablin开发者_运维知识库e(v=c(2.5,7.5), lty=2) #95%CI

Question

The optimization two points closely and all of the error is in the third. However, I would like it to fit the points evenly.

How can I get the function to give equal weight to the magnitude of the a, b, and c terms? It appears that the function is only fitting a and c.

note: This question is based on one asked at [cross validated][1] except that this version is specifically about the function of R's nlm() optimization algorithm whereas the CV question is about finding the a more appropriate distribution.

The reason your optimization "does not work" is that the scale of the three parameters, a, b, and c does not match. a and b measure a difference in probabilities, and can always be set to be no larger then 0.025 by chosing a really small value for the standard deviation (parms[2]), since then plnorm(2.5, parms[1], parms[2]) will be 0 (same for 7.5). The same amount of error (0.025) would be unnoticable for c - this is the scaling mismatch.

You can rewrite your optimization function so that the errors are measured on the x scale for all three criteria by comparing the quantiles to 2.5 and 7.5:

prior2 <- function(parms) { 
  a <- abs(qlnorm(0.025, parms[1], parms[2]) - 2.5) 
  b <- abs(qlnorm(0.975, parms[1], parms[2]) - 7.5) 
  mode <- exp(parms[1] - parms[2]^2) 
  c <- abs(mode-3.3) 
  return(a + b + c) 
}

This is similar to what Ramnath suggested, except not on the log scale. This approach does not really do well on the left tail because the distribution is skewed right: small changes in the location of the lower 2.5th percentile lead to large changes of the percentile at 2.5, while this is not the case at 7.5. Ramnath's suggestion of working on the log scale solves this problem, since the log-normal distriburion is symmetric on the log-scale.

Another way to improve your fit is to change the optimization criterion. Right now you are minimizing the average absolute error. This means that one large error is OK-ish as long as the other two error terms are really small. You can impose a bigger penalty on large errors by minimizing the mean squared error (a^2+b^2_c^2) instead. This latest version (on the log scale) produces the best-looking estimate from my point of view.

prior3 <- function(parms) { 
  a <- abs(parms[1] - 1.96*parms[2] - log(2.5))
  b <- abs(parms[1] + 1.96*parms[2] - log(7.5)) 
  c <- abs(parms[1] - parms[2]^2 - log(3.3)) 
  return(a^2 + b^2 + c^2) 
}

try an alternate formulation of your optimization function. the log of the 95% confidence interval for the lognormal distribution is given by mu - 2*sigma and mu + 2*sigma. so you can basically try to minimize abs(mu - 2*sigma - log(2.5)) + abs(mu + 2*sigma - log(7.5)) + abs(mu - sigma^2 - log(3.3)).

when i minimized this, i find that the confidence intervals are fit very closely, while the mode is a little off. depending on the nature of your application, you might want to weight the three terms differently

继续阅读：optimization probability r

Why does this code not optimize for all three points?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？