开发者

trying to display original and fitted data (nls + dnorm) with ggplot2's geom_smooth()

I am exploring some data, so the first thing I wanted to do was try to fit a normal (Gaussian) distribution to it. This is my first time trying this in R, so I'm taking it one step at a time. First I pre-binned my data:

myhist = data.frame(size = 10:27, counts = c(1L, 3L, 5L, 6L, 9L, 14L, 13L, 23L, 31L, 40L, 42L, 22L, 14L, 7L, 4L, 2L, 2L, 1L) )

qplot(x=size, y=counts, data=myhist)

trying to display original and fitted data (nls + dnorm) with ggplot2's geom_smooth()

Since I want counts, I need to add a normalization factor (N) to scale up the density:

fit = nls(counts ~ N * dnorm(size, m, s), data=myhist, start=c(m=20, s=5, N=sum(myhist$counts)) )   

Then I create the fitted data for display and everything works great:

x = seq(10,30,0.2)
fitted = data.frame(size = x, counts=predict(fit, data.frame(size=x)) )
ggplot(data=myhist, aes(x=size, y=counts)) + geom_point() + geom_line(data=fitted)

trying to display original and fitted data (nls + dnorm) with ggplot2's geom_smooth()

I got excited when I found this thread which talks about using geom_smooth() to do it all in one step, but I can't get it to work:

  • http://www.mail-archive.com/r-help@r-project.org/msg109882.html

Here's what I try... and what I get:

ggplot(data=myhist, aes(x=size, y=counts)) + geom_point() + geom_smooth(metho开发者_开发知识库d="nls", formula = counts ~ N * dnorm(size, m, s), se=F, start=list(m=20, s=5, N=300, size=10))

Error in method(formula, data = data, weights = weight, ...) : 
  parameters without starting value in 'data': counts

The error seems to indicate that it's trying to fit for the observed variable, counts, but that doesn't make any sense, and it predictably freaks out if I specify a "starting" value for counts too:

fitting parameters ‘m’, ‘s’, ‘N’, ‘size’, ‘counts’ without any variables

Error in eval(expr, envir, enclos) : object 'counts' not found

Any idea what I'm doing wrong? It's not the end of the world, of course, but fewer steps are always better, and you guys always come up with the most elegant solutions to these common tasks.

Thanks in advance!

Jeffrey


the first error indicates that ggplot2 cannot find the variable 'count', which is used in formula, in data.

Stats take place after mapping, that is, size -> x, and counts -> y.

Here is an example for using nls in geom_smooth:

ggplot(data=myhist, aes(x=size, y=counts)) + geom_point() + 
  geom_smooth(method="nls", formula = y ~ N * dnorm(x, m, s), se=F, 
              start=list(m=20, s=5, N=300)) 

The point is that using x and y, instead of size and counts, in the specification of formula.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜