Creating multiple subsets all in one data.frame (possibly with ddply)

I have a large data.frame, and I'd like to reduce it by taking a quantile subset within each level of one of the variables. For example:

x <- rep(1:10, 10)
df <- data.frame(x, rnorm(100))

df2 <- subset(df, x == 1)
df3 <- subset(df2, rnorm.100. > quantile(rnorm.100., 0.8))

What I would like to end up with is a single data.frame that contains these quantile subsets for every x = 1, 2, 3, ..., 10.

Is there a way to do this with ddply?


You could try:

library(plyr)
ddply(df, .(x), subset, rnorm.100. > quantile(rnorm.100., 0.8))

And off topic: you could use df <- data.frame(x,y=rnorm(100)) to name a column on-the-fly.
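For anyone without plyr installed, here is a minimal base-R sketch of the same per-group subset. It assumes the second column is named y as suggested above, and it swaps in split/lapply/rbind for ddply; the set.seed() call and the names pieces and top are only for illustration.

```r
set.seed(1)  # assumption: fixed seed, only so the example is reproducible
df <- data.frame(x = rep(1:10, 10), y = rnorm(100))

# Split by x, keep the rows above each group's 80% quantile, then recombine
pieces <- lapply(split(df, df$x), function(d) d[d$y > quantile(d$y, 0.8), ])
top <- do.call(rbind, pieces)
```

With 10 values per level of x, this should keep the two largest y values in each group.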


Here's a different approach using the little-used ave() function, which is very fast to compute.

Make a new column that holds the 0.8 quantile computed within each level of x:

df$quantByX <-  ave(df$rnorm.100., df$x, FUN = function (x) quantile(x,0.8))

Then keep just the x column and the new column, dropping duplicate rows:

df2 <- unique(df[,c(1,3)])

The result is one data frame with the unique items in the x column and the calculated quantile for each level of x.
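If what you want is the rows above each group's quantile rather than the quantile values themselves, the same ave() column can be used as a filter. This is a rough sketch, not part of the answer above; the column name y, the seed, and the name top are assumptions made for illustration.

```r
set.seed(1)  # assumption: fixed seed, only so the example is reproducible
df <- data.frame(x = rep(1:10, 10), y = rnorm(100))

# Per-row copy of the 80% quantile of y within that row's level of x
df$quantByX <- ave(df$y, df$x, FUN = function(v) quantile(v, 0.8))

# Keep only the rows that exceed their own group's quantile
top <- df[df$y > df$quantByX, ]
```

This avoids splitting the data.frame at all, since the comparison is done row by row against the precomputed column.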
