subselection dataframe
I have a simple question, I think. In my data frame I would like to make a subset where the column Quality_score is equal to one of: Perfect, Perfect*, Perfect*, Good, Good** and Good***
This is my solution so far:
Quality_scoreComplete <- subset(completefile, Quality_score == "Perfect" | Quality_score == "Perfect***" | Quality_score == "Perfect****" | Quality_score == "Good" | Quality_score == "Good***" | Quality_score == "Good****")
Is there a way to simplify this method? Like:
methods <- c('Perfect', 'Perfect***', 'Perfect****', 'Good', 'Good***', 'Good****')
Quality_scoreComplete <- subset(completefile,Quality_score==methods)
Thank you all,
Lisanne
You do not even need subset; check ?"[":
Quality_scoreComplete <- completefile[completefile$Quality_score %in% methods,]
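A minimal sketch of the same idea on a toy data frame, since the real completefile is not shown in the question. The rows and the foo column below are made up for illustration; only Quality_score and methods come from the question.

```r
# Toy stand-in for the asker's data (hypothetical rows).
completefile <- data.frame(
  Quality_score = c("Perfect", "Good***", "Bad", "Perfect****", "Poor", "Good"),
  foo = 1:6,
  stringsAsFactors = FALSE
)
methods <- c("Perfect", "Perfect***", "Perfect****", "Good", "Good***", "Good****")

# %in% tests each row's value for membership in `methods`.
keep <- completefile$Quality_score %in% methods

# Plain bracket indexing and subset() select the same rows.
by_bracket <- completefile[keep, ]
by_subset  <- subset(completefile, Quality_score %in% methods)

print(by_bracket)
print(by_subset)
```

Either form should work here; subset() is a little shorter interactively, while the bracket form is the usual choice inside functions.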
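EDITED: based on a kind comment from @Sacha Epskamp: == in the expression gives wrong results, so I corrected it above to %in%. With ==, R recycles the shorter vector along the column and compares element by element, so a row matches only when its value happens to line up with the right position. Thanks!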
Example of the problem:
> x <- c(17, 19)
> cars[cars$speed==x,]
   speed dist
29    17   32
31    17   50
36    19   36
38    19   68
> cars[cars$speed %in% x,]
   speed dist
29    17   32
30    17   40
31    17   50
36    19   36
37    19   46
38    19   68
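To see why rows 30 and 37 go missing with ==, here is a small sketch using the built-in cars data from the example. The variable names recycled, same_as_recycling, n_eq and n_in are just illustrative.

```r
x <- c(17, 19)

# == recycles x to the length of cars$speed (17, 19, 17, 19, ...),
# so each row is compared against only one of the two values.
recycled <- rep(x, length.out = nrow(cars))
same_as_recycling <- identical(cars$speed == x, cars$speed == recycled)

# %in% checks every row against the whole set of values instead.
n_eq <- sum(cars$speed == x)
n_in <- sum(cars$speed %in% x)

print(same_as_recycling)
print(c(equals = n_eq, in_set = n_in))
```

Row 30 has speed 17 but sits at an even position, where the recycled vector holds 19, so == drops it; the same happens to row 37 with the values swapped.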
One thing that works is grepl, which searches for a pattern in strings and returns a logical indicating whether it is there. You can use the | operator inside the pattern string to indicate OR, and ignore.case to ignore case:
methods<-c('Perfect', 'Perfect*', 'Perfect*', 'Good', 'Good','Good*')
completefile <- data.frame( Quality_score = c( methods, "bad", "terrible", "abbysmal"), foo = 1)
subset(completefile,grepl("good|perfect",Quality_score,ignore.case=TRUE))
  Quality_score foo
1       Perfect   1
2      Perfect*   1
3      Perfect*   1
4          Good   1
5          Good   1
6         Good*   1
EDIT: I see now that case sensitivity was not an issue, thanks dyslexia! You could simplify then to:
subset(completefile,grepl("Good|Perfect",Quality_score))
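One caveat worth keeping in mind: grepl matches anywhere in the string, so the pattern above would also keep a label that merely contains "Good" or "Perfect". If that matters, an anchored pattern is one possible refinement. The sketch below uses a made-up scores vector, and it assumes the only allowed suffix is a run of asterisks.

```r
# Hypothetical labels, including two that only contain "Good".
scores <- c("Perfect", "Perfect*", "Good***", "Goodish", "Not Good", "bad")

# Unanchored: TRUE wherever "Good" or "Perfect" appears as a substring.
loose <- grepl("Good|Perfect", scores)

# Anchored: the whole label must be Good or Perfect, optionally
# followed by literal asterisks (\\* escapes the regex metacharacter).
strict <- grepl("^(Good|Perfect)\\**$", scores)

print(data.frame(scores, loose, strict))
```

If the set of valid labels is fixed and known, %in% with an explicit vector is probably simpler than a regular expression.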