On using plyr and ldply
I have a recurring problem - I apologize!
Say I want to have the baseball data (from the plyr package) listed according to 'id' and 'year'. There is a difference between creating the list according to either:
1. mylist1 <- dlply(baseball, .(id, year), identity)
and
2. mylist2 <- dlply(baseball, .(id), dlply, .(year), identity)
in the way the list is organized. Getting the list back into a data frame works fine with 'mylist1':
mydf1 <- ldply(mylist1)
but not with 'mylist2'
mydf2 <- ldply(mylist2)
which gives the following error message:
Error in list_to_dataframe(res, attr(.data, "split_label")): Result must be all atomic, or all data frames
I am a newbie to R, and this error message doesn't make much sense to me.
I would like to split my own data frame according to method 2, since I need to do quite a bit of data manipulation. My question is: how can I merge this list back into a data frame? Is there an alternative to do.call(rbind, do.call(rbind, ...?
I am grateful for any help!
I agree with @Andrie that this is an odd structure. But I assume that you have a particular reason for doing it this way.
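Each element of mylist2 is itself a list of data frames rather than a data frame, which is why a single ldply fails with that error. Since it took two passes with dlply to create mylist2, it takes two invocations of ldply to put it back together.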
mydf2 <- ldply(mylist2, ldply)
This restores baseball (modulo ordering):
> class(mydf2)
[1] "data.frame"
> all(dim(mydf2) == dim(baseball))
[1] TRUE
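To see the same round trip on something small enough to inspect by eye, here is a minimal sketch. The data frame df and its hits column are made up for illustration and only stand in for baseball; the names nested, flat and flat_base are likewise mine. The last step shows one possible base-R alternative to nesting do.call(rbind, ...) twice, assuming the list is exactly two levels deep.

```r
library(plyr)

# Hypothetical toy data standing in for baseball
df <- data.frame(
  id   = c("a", "a", "a", "b", "b"),
  year = c(2001, 2001, 2002, 2001, 2003),
  hits = c(10, 12, 7, 3, 9),
  stringsAsFactors = FALSE
)

# Method 2 from the question: split by id, then by year within each id
nested <- dlply(df, .(id), dlply, .(year), identity)

# The top-level elements are lists of data frames, not data frames,
# which is what a single ldply(nested) trips over
inner_is_df <- vapply(nested, is.data.frame, logical(1))

# One ldply per level of nesting
flat <- ldply(nested, ldply)

# Possible base-R alternative: flatten one level, then rbind once
flat_base <- do.call(rbind, unlist(nested, recursive = FALSE))

# Both results have as many rows as the original data frame
nrow(flat) == nrow(df)
nrow(flat_base) == nrow(df)
```

Note that the do.call(rbind, ...) version keeps compound row names built from the list names, whereas ldply resets them, so the two results are not identical objects even though they hold the same rows.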