Finding means and medians across data frames in R
I have several data frames, a, b, c, d, each with the same column names. I want to find the mean and median of those data frames. In other words, construct new mean and median data frames that are the same size as a, b, etc.
I could use a couple of for loops, but I bet there is a slick way of doing this using the R built-in functions that would be faster.
Following Josh Ulrich's answer, how about this?
library(abind)
apply(abind(a, b, c, d, along=3), c(1,2), median)
(Using rowMeans on the appropriate slice will still be faster than applying mean. I think there is a rowMedians in the Biobase (Bioconductor) package if you really need speed.)
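The same idea can be sketched in base R without abind. This is a minimal sketch, assuming every data frame is all-numeric and has identical dimensions; the names dfs, arr, med_df and mean_df and the column names x and y are made up for illustration. It stacks the frames into a three-dimensional array, then takes the median with apply and the mean with rowMeans(..., dims = 2), which averages over the last dimension.

```r
# Toy data: four data frames with the same shape and column names
set.seed(1)
a <- data.frame(x = rnorm(5), y = runif(5))
b <- data.frame(x = rnorm(5), y = runif(5))
c <- data.frame(x = rnorm(5), y = runif(5))
d <- data.frame(x = rnorm(5), y = runif(5))
dfs <- list(a, b, c, d)

# Stack into a rows x cols x frames array (assumes all-numeric columns)
arr <- array(unlist(lapply(dfs, as.matrix)),
             dim = c(nrow(a), ncol(a), length(dfs)),
             dimnames = list(NULL, names(a), NULL))

# Element-wise median across the third dimension
med_df <- as.data.frame(apply(arr, c(1, 2), median))

# Element-wise mean: with dims = 2, rowMeans averages over dimension 3
mean_df <- as.data.frame(rowMeans(arr, dims = 2))
```

Both results have the same number of rows and columns as a, with the original column names carried over through dimnames.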
I'm not sure JD's answer gives you exactly what you want, since the resulting object wouldn't be the same dimensions as a, b, etc.
Putting your data.frames into a list is a good start though. Then you can subset each column into a new list, cbind that list into a matrix and use apply over its rows.
a <- data.frame(rnorm(10), runif(10))
b <- data.frame(rnorm(10), runif(10))
c <- data.frame(rnorm(10), runif(10))
d <- data.frame(rnorm(10), runif(10))
myList <- list(a,b,c,d)
sapply(1:ncol(a), function(j) {  # median
  apply(do.call(cbind, lapply(myList, `[`, , j)), 1, median)
})
sapply(1:ncol(a), function(j) {  # mean
  apply(do.call(cbind, lapply(myList, `[`, , j)), 1, mean)
})
sapply(1:ncol(a), function(j) {  # faster mean
  rowMeans(do.call(cbind, lapply(myList, `[`, , j)))
})
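For the mean alone, another option (not proposed in the answers here, so treat it as a sketch) is to add the data frames element-wise with Reduce and divide by their count. It assumes all-numeric data frames of identical shape; meanDF is a name chosen for illustration. Unlike the sapply calls above, which return a plain matrix, this keeps a data frame with the original column names. It does not extend to the median.

```r
# Same setup as above: four data frames with identical shape
set.seed(42)
a <- data.frame(rnorm(10), runif(10))
b <- data.frame(rnorm(10), runif(10))
c <- data.frame(rnorm(10), runif(10))
d <- data.frame(rnorm(10), runif(10))
myList <- list(a, b, c, d)

# Sum the data frames element-wise, then divide by how many there are
meanDF <- Reduce(`+`, myList) / length(myList)
```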
You could put your data frames into a list of data frames, then use lapply(myList, mean, ...).