ggplot2 - referencing summary statistics / layers

I've picked up the ggplot2 book, but I'm struggling to understand how data persists through layers.

For example, let's take a dataset and calculate the mean of Y at each X:

thePlot = ggplot( myDF , aes_string( x = "IndepentVar" , y = "DependentVar" ) )
thePlot = thePlot + stat_summary( fun.y = mean , geom = "point" )

How do I "access" the summary statistics in the next layer? For example, let's say I want to plot a smooth line over the dataset. This seems to work:

thePlot = thePlot + stat_smooth( aes( group = 1 ) , method = "lm" , geom = "smooth" , se = FALSE )

But let's say I also want to ignore a particular X value when fitting the line. How do I reference the summarized dataset to exclude that X?

More generally, how is data referenced as it flows through layers? Am I always limited to the most recent statistic? Can I reference the original dataset?


Here is an attempt at answering your question:

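  1. The data and aesthetics defined in the ggplot call are used as defaults by every subsequent layer unless that layer overrides them. That is why the stat_smooth layer works without restating x and y. Note that each layer computes its statistic from the data it is given, so stat_smooth is fitted to the original data, not to the means drawn by stat_summary.
  2. You can specify the data frame and aesthetics for each layer separately. For example, to exclude some values of x when fitting the smooth line, older ggplot2 releases accepted subset = .(x != xvalues) inside the geom_smooth call. Later releases removed that argument, so in current versions pass a filtered data frame through the layer's data argument instead.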
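
The two points above can be sketched as follows. This is only an illustration under some assumptions: myDF is filled with made-up data, excludedX and meansDF are hypothetical names introduced here, the column names are copied from the question, and fun = mean assumes ggplot2 3.3.0 or later (older versions spell it fun.y).

```r
library(ggplot2)

# Made-up data standing in for myDF from the question
set.seed(1)
myDF <- data.frame(
  IndepentVar  = rep(1:5, each = 10),
  DependentVar = rnorm(50, mean = rep(1:5, each = 10))
)

# Hypothetical X value to leave out of the fitted line
excludedX <- 3

# Compute the per-X means up front so they are an ordinary data frame
meansDF <- aggregate(DependentVar ~ IndepentVar, data = myDF, FUN = mean)

# Data for the smooth layer only: the means, minus the excluded X
fitDF <- subset(meansDF, IndepentVar != excludedX)

thePlot <- ggplot(myDF, aes(x = IndepentVar, y = DependentVar)) +
  # Layer 1 inherits the default data (myDF) and default aesthetics
  stat_summary(fun = mean, geom = "point") +
  # Layer 2 overrides the data but still inherits x and y
  geom_smooth(
    data = fitDF, aes(group = 1),
    method = "lm", formula = y ~ x, se = FALSE
  )
```

Summarising before plotting is a design choice rather than a requirement: it makes the "summarized dataset" a regular object you can filter, which a layer cannot do with another layer's computed statistic. To fit the line to the raw observations instead, pass subset(myDF, IndepentVar != excludedX) as the data.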

I can provide more detailed examples, if you have specific questions.
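
As one such example, if the goal is just to look at what a layer computed, layer_data() returns the data frame a given layer ends up drawing. A minimal sketch, again with made-up data and assuming ggplot2 3.3.0 or later for fun = mean; summaryDF is a name introduced here:

```r
library(ggplot2)

# Made-up data standing in for myDF from the question
set.seed(1)
myDF <- data.frame(
  IndepentVar  = rep(1:5, each = 10),
  DependentVar = rnorm(50)
)

thePlot <- ggplot(myDF, aes(x = IndepentVar, y = DependentVar)) +
  stat_summary(fun = mean, geom = "point")

# Data computed for layer 1 (the stat_summary layer), one row per X
summaryDF <- layer_data(thePlot, 1)
print(summaryDF[, c("x", "y")])
```

This is for inspection after the fact. It does not make the summary available to later layers, which still start from their own data.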

Hope this helps
