R: Question about table reshaping
I have the following data frame:
id,property1,property2,property3
1,1,0,0
2,1,1,0
3,0,0,1
4,1,1,1
d.f <- structure(list(id = 1:4, property1 = c(1L, 1L, 0L, 1L), property2 = c(0L,
1L, 0L, 1L), property3 = c(0L, 0L, 1L, 1L)), .Names = c("id",
"property1", "property2", "property3"), class = "data.frame", row.names = c(NA,
-4L))
What is the least cumbersome way to get the following data frame:
id,properties_list
1,property1
2,property1, property2
3,property3
4,property1, property2, property3
Maybe something like melt or reshape with fancy options?
This solution assumes you're looking for a data frame similar to how gsk3 interpreted the question (pasting the properties together), but with the obligatory avoidance of a for loop, just 'cause that's how we roll with R:
property_list <- apply(d.f[,-1],1,
FUN=function(x,nms){paste(nms[as.logical(x)],collapse=",")},
nms=colnames(d.f)[-1])
as.data.frame(cbind(d.f$id,property_list))
V1 property_list
1 1 property1
2 2 property1,property2
3 3 property3
4 4 property1,property2,property3
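Note that cbind() builds a character matrix here, so the id column comes back as text and is named V1. If that is unwanted, one possible variant (a sketch on the same d.f as in the question; the names result and properties_list are just illustrative) is to build the result with data.frame() directly:

```r
# Same data frame as in the question
d.f <- data.frame(id = 1:4,
                  property1 = c(1L, 1L, 0L, 1L),
                  property2 = c(0L, 1L, 0L, 1L),
                  property3 = c(0L, 0L, 1L, 1L))

nms <- colnames(d.f)[-1]

# For each row, keep the names of the columns flagged 1 and join them
property_list <- apply(d.f[, -1], 1,
                       function(x) paste(nms[as.logical(x)], collapse = ","))

# data.frame() keeps id as an integer column and names both columns
result <- data.frame(id = d.f$id,
                     properties_list = unname(property_list),
                     stringsAsFactors = FALSE)
result
```

Setting stringsAsFactors = FALSE only matters on R versions before 4.0, where character columns would otherwise become factors.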
This isn't a reshape at all, really. Use paste.
for(i in seq(1,3) ) {
tf <- as.logical(d.f[,i+1])
d.f[,i+1] <- as.character(d.f[,i+1])
d.f[,i+1][tf] <- colnames(d.f)[i+1]
d.f[,i+1][!tf] <- " "
}
d.f$property.list <- paste(d.f[,2],d.f[,3],d.f[,4],sep=" ")
As always, you'll get better answers if you dput() your data frame first:
d.f <- structure(list(id = 1:4, property1 = c(1L, 1L, 0L, 1L), property2 = c(0L,
1L, 0L, 1L), property3 = c(0L, 0L, 1L, 1L)), .Names = c("id",
"property1", "property2", "property3"), class = "data.frame", row.names = c(NA,
-4L))
The output you describe is not actually a proper data frame, since a data frame must have the same number of entries in every row, so what you really want may be a list. If that's not what you want, then try this:
dfrm <- d.f  # dfrm: a copy of the data frame from the question
dfrm[-1] <- t( apply(dfrm[-1], 1, function(x) ifelse(x, names(x), "") ) )
dfrm
id property1 property2 property3
1 1 property1
2 2 property1 property2
3 3 property3
4 4 property1 property2 property3
You need the t() because apply() over rows returns each row's result as a column of the output (R fills matrices in column-major order), so the result comes back transposed.
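The transposition is easy to see on a toy matrix. This is only an illustrative sketch; m and r are made-up names, not part of the question's data:

```r
m <- matrix(1:6, nrow = 2)             # 2 rows, 3 columns

# Row-wise apply: each row's result (length 3) becomes a COLUMN of r
r <- apply(m, 1, function(x) x * 10)

dim(m)  # 2 3
dim(r)  # 3 2, i.e. transposed relative to m

# t() restores the original orientation
all(t(r) == m * 10)
```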
If you do want the list version then here's one approach:
prop_list <- apply(dfrm[-1], 1, function(x) c(names(x)[ as.logical(x)] ) )
names(prop_list) <- dfrm[,1]
prop_list
$`1`
[1] "property1"
$`2`
[1] "property1" "property2"
$`3`
[1] "property3"
$`4`
[1] "property1" "property2" "property3"
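One caveat with the apply() version: if every row happens to have the same number of properties set, apply() simplifies its result to a vector or matrix instead of a list. A sketch that always returns a list, assuming dfrm is the original data frame from the question (props is an illustrative helper name):

```r
# Original data frame from the question
dfrm <- data.frame(id = 1:4,
                   property1 = c(1L, 1L, 0L, 1L),
                   property2 = c(0L, 1L, 0L, 1L),
                   property3 = c(0L, 0L, 1L, 1L))

props <- names(dfrm)[-1]

# lapply() never simplifies, so the result is a list for any input
prop_list <- lapply(seq_len(nrow(dfrm)), function(i) {
  props[as.logical(unlist(dfrm[i, -1]))]
})
names(prop_list) <- dfrm$id
prop_list
```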