Can I avoid using data frames in ggplot2?

2022-12-17 11:18 Q&A author:

I'm running a monte-carlo simulation and the output is in the form:

> d = data.frame(iter=seq(1, 2), k1 = c(0.2, 0.6), k2=c(0.3, 0.4))
> d
iter  k1   k2
1     0.2  0.3
2     0.6  0.4

The plots I want to generate are:

plot(d$iter, d$k1)
plot(density(d$k1))

I know how to do the equivalent plots in ggplot2 by first converting to a long-format data frame:

new_d = data.frame(iter=rep(d$iter, 2), 
                   k = c(d$k1, d$k2), 
                   label = rep(c('k1', 'k2'), each=2))

Then plotting is easy. However, the number of iterations can be very large, and so can the number of k's. This means juggling a very large data frame.

Is there any way I can avoid creating this new data frame?

Thanks

Short answer is "no," you can't avoid creating a data frame. ggplot requires the data to be in a data frame. If you use qplot, you can give it separate vectors for x and y, but internally, it's still creating a data frame out of the parameters you pass in.

I agree with juba's suggestion -- learn to use the reshape function, or better yet the reshape package with melt/cast functions. Once you get fast with putting your data in long format, creating amazing ggplot graphs becomes one step closer!
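For reference, here is a minimal sketch of getting to long format without any extra package, using base R's stack() instead of melt. It assumes the parameter columns are all named k1, k2, ... (that naming pattern is my assumption, taken from the toy data in the question), so the same code should work however many k's the simulation produces.

```r
# Same toy data as in the question.
d <- data.frame(iter = seq(1, 2), k1 = c(0.2, 0.6), k2 = c(0.3, 0.4))

# Assumption: every simulated parameter column is named k1, k2, ...
k_cols <- grep("^k[0-9]+$", names(d), value = TRUE)

# stack() puts the numbers in `values` and the source column name in `ind`.
stacked <- stack(d[k_cols])

# Repeat the iteration index once per parameter column.
long <- data.frame(
  iter  = rep(d$iter, times = length(k_cols)),
  k     = stacked$values,
  label = stacked$ind
)
long
```

The result has one row per (iteration, parameter) pair, which is the same shape as new_d in the question.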

Yes, it is possible for you to avoid creating a data frame: just give an empty argument list to the base layer, ggplot(). Here is a complete example based on your code:

library(ggplot2)

d = data.frame(iter=seq(1, 2), k1 = c(0.2, 0.6), k2=c(0.3, 0.4))
# desired plots:
# plot(d$iter, d$k1)
# plot(density(d$k1))

ggplot() + geom_point(aes(x = d$iter, y = d$k1))
# there is not enough data for a good density plot,
# but this is how you would do it:
ggplot() + geom_density(aes(d$k1))

Note that although this lets you avoid creating a data frame yourself, one may still be created internally. See, e.g., the following extract from ?geom_point:

All objects will be fortified to produce a data frame.
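One way to check this for yourself is to look at the data ggplot2 computes for the layer. The sketch below uses layer_data(), a ggplot2 helper that returns the data behind a given layer of a built plot; the variable names p and layer_df are just for illustration.

```r
library(ggplot2)

d <- data.frame(iter = seq(1, 2), k1 = c(0.2, 0.6), k2 = c(0.3, 0.4))

# No data frame is passed to ggplot(); the aesthetics are plain vectors.
p <- ggplot() + geom_point(aes(x = d$iter, y = d$k1))

# Pull out the data ggplot2 assembled for the first (and only) layer.
layer_df <- layer_data(p, 1)

is.data.frame(layer_df)   # the layer data is held in a data frame
layer_df[, c("x", "y")]   # the vectors supplied through aes()
```

So the vectors still end up in a data frame at build time; the only thing saved is constructing it by hand.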

You can use the reshape function to transform your data frame to "long" format. Maybe it is a bit faster than your code?

R> reshape(d, direction="long",varying=list(c("k1","k2")),v.names="k",times=c("k1","k2"))
     iter time   k id
1.k1    1   k1 0.2  1
2.k1    2   k1 0.6  2
1.k2    1   k2 0.3  1
2.k2    2   k2 0.4  2
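As a rough sketch of how that long data frame could then be used, the code below maps the time column produced by reshape() to colour, so every k series lands in one plot from a single ggplot() call. The names long and p are only illustrative.

```r
library(ggplot2)

d <- data.frame(iter = seq(1, 2), k1 = c(0.2, 0.6), k2 = c(0.3, 0.4))

# Same reshape() call as above: one row per (iter, k) pair.
long <- reshape(d, direction = "long",
                varying = list(c("k1", "k2")),
                v.names = "k",
                times = c("k1", "k2"))

# `time` holds the original column name, so it can distinguish the series.
p <- ggplot(long, aes(x = iter, y = k, colour = time)) +
  geom_point()
p
```

With a realistic number of iterations, the density plots from the question follow the same pattern, e.g. by swapping in aes(x = k, colour = time) and geom_density().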

So just to add to the previous answers. With qplot you could do

p <- qplot(y=d$k2, x=d$k1)

and then from there building it further, e.g. with

p + theme_bw()

But I agree - melt/cast is generally the way forward.

Just pass NULL as the data frame, and define the necessary aesthetics using the data vectors. Quick example:

library(MASS)
library(tidyverse)
library(ranger)

rf <- ranger(medv ~ ., data = Boston, importance = "impurity")

rf$variable.importance

ggplot(NULL, aes(x = fct_reorder(names(rf$variable.importance), rf$variable.importance),
                 y = rf$variable.importance)) +
    geom_col(fill = "navy blue", alpha = 0.7) +
    coord_flip() +
    labs(x = "Predictor", y = "Importance", title = "Random Forest") +
    theme_bw()
