R: manipulating data.frames containing strings and booleans
I have a data.frame in R; it's called p
. Each element in the data.frame is either True or False. My variable p
has, say, m rows and n columns. For every row there is strictly only one TRUE
element.
It also has column names, which are strings. What I would like to do is the following:
- For every row in
p
I see aTRUE
I would like to replace with the name of the corresponding column - I would then like to collapse the data.frame, which now contains
FALSE
s and column names, to a single vector, which will have m elements. - I would like to do this in an R-thonic manner, so as to continue my enlightenment in R and contribute to a world without for-loops.
I can do step 1 using the following for loop:
for (i in seq(length(colna开发者_Go百科mes(p)))) {
p[p[,i]==TRUE,i]=colnames(p)[i]
}
but theres's no beauty here and I have totally subscribed to this for-loops-in-R-are-probably-wrong mentality. Maybe wrong is too strong but they're certainly not great.
I don't really know how to do step 2. I kind of hoped that the sum of a string and FALSE
would return the string but it doesn't. I kind of hoped I could use an OR operator of some kind but can't quite figure that out (Python responds to False or 'bob'
with 'bob'
). Hence, yet again, I appeal to you beautiful Rstats people for help!
Here's some sample data:
df <- data.frame(a=c(FALSE, TRUE, FALSE), b=c(TRUE, FALSE, FALSE), c=c(FALSE, FALSE, TRUE))
You can use apply
to do something like this:
names(df)[apply(df, 1, which)]
Or without apply
by using which
directly:
idx <- which(as.matrix(df), arr.ind=T)
names(df)[idx[order(idx[,1]),"col"]]
Use apply
to sweep your index through, and use that index to access the column names:
> df <- data.frame(a=c(TRUE,FALSE,FALSE),b=c(FALSE,FALSE,TRUE),
+ c=c(FALSE,TRUE,FALSE))
> df
a b c
1 TRUE FALSE FALSE
2 FALSE FALSE TRUE
3 FALSE TRUE FALSE
> colnames(df)[apply(df, 1, which)]
[1] "a" "c" "b"
>
精彩评论