How to paste text and variables into a logical expression in R?
I want to paste variables into the logical expression that I am using to subset data, but the subset function does not see them as column names when pasted (either with or without quotes).
I have a dataframe with columns named col1, col2 etc. I want to subset for the rows in which colx < 0.05
This DOES work:
subsetdata<-subset(dataframe, col1<0.05)
subsetdata<-subset(dataframe, col2<0.05)
This does NOT work:
for (k in 1:2){
subsetdata<-subset(dataframe, paste("col",k,sep="")<0.05)
}
for (k in 1:2){
subsetdata<-subset(dataframe, noquote(paste("col",k,sep=""))<0.05)
}
I can't find the answer; any suggestions?
You're making this a lot harder than it needs to be by trying to use subset. Note that ?subset says the second argument (also named subset) must be an expression and you're not giving it an expression:
> is.expression(paste("col",1:2,sep="")<0.05)
[1] FALSE
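To see what is actually being compared, it may help to inspect the value that paste returns. This is only a small sketch; the names k, nm and cmp are made up for illustration:

```r
k <- 1
nm <- paste("col", k, sep = "")   # a character string, not a reference to a column
is.character(nm)                   # TRUE

# Comparing a string with a number coerces the number to a string,
# so this compares the text "col1" with "0.05" and never touches the data.
cmp <- nm < 0.05
length(cmp)                        # a single value, not one value per row
```

Because the result is a single logical value rather than one per row, subset has nothing row-specific to filter on.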
You could construct an unevaluated expression then evaluate it as you pass it to subset, but there are much easier ways. For example: just take advantage of the vectorized nature of the < operator.
# sample data
set.seed(21)
dataframe <- data.frame(col1=rnorm(10),col2=rnorm(10),col3=1)
logicalCols <- dataframe[,paste("col",1:2,sep="")] < 0.05
# col1 col2
# [1,] FALSE TRUE
# [2,] FALSE FALSE
# [3,] FALSE TRUE
# [4,] TRUE FALSE
# [5,] FALSE FALSE
# [6,] FALSE FALSE
# [7,] TRUE FALSE
# [8,] TRUE FALSE
# [9,] FALSE TRUE
# [10,] TRUE TRUE
ANY <- apply(logicalCols, 1, any) # any colx < 0.05
ALL <- apply(logicalCols, 1, all) # all colx < 0.05
dataframe[ANY,]
dataframe[ALL,]
Here are a couple of options that are closer to Jasper's approach. First, you could define the column name as a separate variable and then use it to select the column from the data.frame as if it were a list (since a data.frame is basically a list):
for(k in 1:2){
colname <- paste("col",k,sep="")
subsetdata <- dataframe[dataframe[[colname]] < 0.05, ]
}
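Note that a loop like this overwrites subsetdata on every pass, so only the last column's subset survives. If each subset is needed, one possible sketch is to collect them in a named list; the names colnames_to_test and subsets are hypothetical, and the sample data is just for illustration:

```r
# sample data (illustrative only)
set.seed(21)
dataframe <- data.frame(col1 = rnorm(10), col2 = rnorm(10), col3 = 1)

colnames_to_test <- paste0("col", 1:2)

# one subset per column name, kept in a list instead of being overwritten
subsets <- lapply(colnames_to_test, function(colname) {
  dataframe[dataframe[[colname]] < 0.05, ]
})
names(subsets) <- colnames_to_test

subsets$col1   # rows where col1 < 0.05
subsets$col2   # rows where col2 < 0.05
```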
Or you could select the column by name with matrix-style indexing (the name goes after the comma, since the first index selects rows):
subsetdata <- dataframe[dataframe[, colname] < 0.05, ]
Finally, you could use subset, although you need to provide a logical expression (as pointed out by Joshua Ulrich):
subsetdata <- subset(dataframe, eval(substitute(x < 0.05, list(x = as.name(colname)))))
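Another way to build the call, if that reads more clearly, is bquote, which splices the column name into the expression as a symbol before anything is evaluated. This is a sketch under the assumption that colname holds a single valid column name; call_obj is a hypothetical name:

```r
# sample data (illustrative only)
set.seed(21)
dataframe <- data.frame(col1 = rnorm(10), col2 = rnorm(10), col3 = 1)

colname <- "col1"

# .() is evaluated inside bquote, so the call becomes subset(dataframe, col1 < 0.05)
call_obj <- bquote(subset(dataframe, .(as.name(colname)) < 0.05))
print(call_obj)

subsetdata <- eval(call_obj)
```

Printing the call before evaluating it is a handy way to check that the column name ended up where intended.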
It's not quite clear to me what you're trying to do but perhaps seeing & and | used in a subset operation would be helpful.
Both col1 and col2 less than 0.05:
subsetdata<-subset(dataframe, col1 < 0.05 & col2 < 0.05)
Either col1 or col2 less than 0.05:
subsetdata<-subset(dataframe, col1 < 0.05 | col2 < 0.05)
Joshua's answer is a great way of doing this more easily over many columns.
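If you want to keep the & and | style but have the column names in a character vector, one possible sketch is to combine the per-column tests with Reduce. The names cols, below, all_below and any_below are made up for this example:

```r
# sample data (illustrative only)
set.seed(21)
dataframe <- data.frame(col1 = rnorm(10), col2 = rnorm(10), col3 = 1)

cols <- paste0("col", 1:2)

# one logical vector per column: TRUE where the value is below 0.05
below <- lapply(dataframe[cols], function(x) x < 0.05)

all_below <- Reduce(`&`, below)   # every listed column < 0.05
any_below <- Reduce(`|`, below)   # at least one listed column < 0.05

both   <- dataframe[all_below, ]
either <- dataframe[any_below, ]
```

This scales to any number of columns without typing each condition by hand.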