
Convert a "by" object to a data frame in R

I'm using the "by" function in R to chop up a data frame and apply a function to different parts, like this:

pairwise.compare <- function(x) {
    Nright <- ...
    Nwrong <- ...
    Ntied <- ...
    return(c(Nright=Nright, Nwrong=Nwrong, Ntied=Ntied))
}
Z.by <- by(rankings, INDICES=list(rankings$Rater, rankings$Class), FUN=pairwise.compare)

The result (Z.by) looks something like this:

: 4 
: 357 
Nright Nwrong Ntied
     3      0     0
------------------------------------------------------------
: 8 
: 357 
NULL
------------------------------------------------------------
: 10 
: 470 
Nright Nwrong Ntied
     3      4     1 
------------------------------------------------------------
: 11 
: 470 
Nright Nwrong Ntied
    12      4     1

What I would like is to have this result converted into a data frame (with the NULL entries not present) so it looks like this:

  Rater Class Nright Nwrong Ntied
1     4   357      3      0     0
2    10   470      3      4     1
3    11   470     12      4     1

How do I do that?


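The by function returns a list, so you can bind its elements together row by row. rbind() skips the NULL entries, but note that the result does not include columns identifying the groups. Something like this: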

data.frame(do.call("rbind", by(x, column, mean)))


Consider using ddply in the plyr package instead of by. It returns a data frame directly and takes care of adding the grouping columns to the result.


Old thread, but for anyone who searches for this topic:

analysis = by(...)
data.frame(t(vapply(analysis,unlist,unlist(analysis[[1]]))))

unlist() takes one element of the by() output (here, analysis) and returns it as a named vector. vapply() applies unlist() to every element of analysis and collects the results. Its third argument (FUN.VALUE) is a template that tells vapply() the type and length of each result, which is what unlist(analysis[[1]]) is there for. Because every element has to match that template, this fails if analysis is empty or if any group is NULL, so you may need to check for that first. Each result becomes a column, so t() transposes the matrix so that each analysis entry becomes a row.
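Here is a small sketch of that approach on made-up data. The data frame scores and the function count_signs() are hypothetical stand-ins (the question does not show the real data or the body of pairwise.compare), and a single grouping column is used so that no group is empty.

```r
# Made-up data with a single grouping column, so every group has rows.
scores <- data.frame(
  Rater = c(4, 4, 10, 10, 11),
  Score = c(1, 1, -1, 0, 1)
)

# Hypothetical summary function that returns a named vector per group.
count_signs <- function(d) {
  c(Nright = sum(d$Score > 0),
    Nwrong = sum(d$Score < 0),
    Ntied  = sum(d$Score == 0))
}

analysis <- by(scores, scores$Rater, count_signs)

# FUN.VALUE is a template: each group's result must match its type and length.
template <- unlist(analysis[[1]])
wide <- vapply(analysis, unlist, template)  # one column per group
result <- data.frame(t(wide))               # one row per group
print(result)
```

The group labels are not added as columns here; they would have to be attached separately.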


This expands upon Shane's solution of using rbind(), but it also adds columns identifying the groups and removes the NULL groups, two features that were requested in the question. It uses only base R functions, so no additional packages such as plyr are required.

simplify_by_output = function(by_output) {
    # by() returns NULL for combinations of grouping variables that have no data.
    # rbind() ignores those, so keep track of where they are.
    null_ind = unlist(lapply(by_output, is.null))
    # Combine the non-NULL results row-wise.
    by_df = do.call(rbind, by_output)
    # Add columns identifying the groups, dropping the labels of groups without data.
    return(cbind(expand.grid(dimnames(by_output))[!null_ind, , drop = FALSE], by_df))
}
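
To see the same steps end to end, here is a sketch on made-up data. The values in rankings and the function count_signs() are assumptions for illustration only (the body of pairwise.compare is elided in the question). Naming the elements of the INDICES list is my addition; it makes the group columns come out as Rater and Class rather than Var1 and Var2.

```r
# Toy data; column names follow the question, the values are made up.
rankings <- data.frame(
  Rater = c(4, 4, 10, 10, 11),
  Class = c(357, 357, 470, 470, 470),
  Score = c(1, 1, -1, 0, 1)
)

# Hypothetical stand-in for pairwise.compare().
count_signs <- function(d) {
  c(Nright = sum(d$Score > 0),
    Nwrong = sum(d$Score < 0),
    Ntied  = sum(d$Score == 0))
}

# A named INDICES list gives the by object named dimnames.
Z.by <- by(rankings,
           list(Rater = rankings$Rater, Class = rankings$Class),
           count_signs)

# TRUE for the Rater/Class combinations that actually have data.
keep <- !vapply(Z.by, is.null, logical(1))

# expand.grid() lists the combinations in the same order as the by object.
groups <- expand.grid(dimnames(Z.by), stringsAsFactors = FALSE)[keep, , drop = FALSE]

result <- cbind(groups, do.call(rbind, Z.by))
rownames(result) <- NULL
print(result)
```

The Rater and Class columns come back as character strings because they are taken from the dimnames; wrap them in as.numeric() if you need numbers.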


I would do

x = by(data, list(data$x, data$y), function(d) whatever(d))
array(x, dim(x), dimnames(x))