Filter a vector of strings based on string matching
I have the following vector:
X <- c("mama.log", "papa.log", "mimo.png", "mentor.log")
How do I retrieve another vector that only contains elements starting with "m" and ending with ".log"?
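You can use grepl with a regular expression. The trailing $ anchors the match to the end of the string, so the element has to end with ".log":
X[grepl("^m.*\\.log$", X)]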
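To see why the end anchor matters, here is a small sketch. The vector `files` and its extra element "m.log.bak" are hypothetical, added only to show the difference; they are not part of the original question.

```r
# Hypothetical input: "m.log.bak" contains ".log" but does not end with it
files <- c("mama.log", "papa.log", "mimo.png", "mentor.log", "m.log.bak")

# Without "$" the pattern only requires ".log" somewhere after the leading "m",
# so this keeps "mama.log", "mentor.log" and also "m.log.bak"
unanchored <- files[grepl("^m.*\\.log", files)]

# With "$" the string must end in ".log",
# so this keeps only "mama.log" and "mentor.log"
anchored <- files[grepl("^m.*\\.log$", files)]

print(unanchored)
print(anchored)
```

For the vector in the question both patterns happen to give the same result, but the anchored one is the one that actually expresses "ending with .log".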
Try this:
grep("^m.*[.]log$", X, value = TRUE)
## [1] "mama.log" "mentor.log"
A variation of this is to use a glob rather than a regular expression:
grep(glob2rx("m*.log"), X, value = TRUE)
## [1] "mama.log" "mentor.log"
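If you are curious what the glob turns into, you can inspect the result of glob2rx() before passing it to grep(). This is only an illustrative sketch; the variable names `rx` and `matches` are mine.

```r
X <- c("mama.log", "papa.log", "mimo.png", "mentor.log")

# glob2rx() lives in the utils package and converts a wildcard (glob)
# pattern into an anchored regular expression
rx <- utils::glob2rx("m*.log")
print(rx)

# The generated regular expression can be used like any hand-written one
matches <- grep(rx, X, value = TRUE)
print(matches)
```

The glob form is handy when the pattern comes from a user who thinks in file-name wildcards rather than regular expressions.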
The documentation on the stringr package says:
str_subset() is a wrapper around x[str_detect(x, pattern)], and is equivalent to grep(pattern, x, value = TRUE).
str_which() is a wrapper around which(str_detect(x, pattern)), and is equivalent to grep(pattern, x).
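A quick way to check those equivalences on the vector from the question is sketched below. It assumes the stringr package is installed; the variable names are only for illustration.

```r
library(stringr)

X <- c("mama.log", "papa.log", "mimo.png", "mentor.log")
pattern <- "^m.*\\.log$"

# str_subset() returns the matching elements, like grep(..., value = TRUE)
subset_stringr <- str_subset(X, pattern)
subset_base <- grep(pattern, X, value = TRUE)

# str_which() returns the positions of the matches, like grep()
which_stringr <- str_which(X, pattern)
which_base <- grep(pattern, X)

print(subset_stringr)
print(which_stringr)

# Compare the stringr results with their base R counterparts
print(identical(subset_stringr, subset_base))
print(all(which_stringr == which_base))
```

Note that stringr uses the ICU regular expression engine while grep() uses R's default engine, so the two can differ for more exotic patterns; for a simple pattern like this one they should agree.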
So, in your case, a more elegant way to accomplish the task using the tidyverse instead of base R is as follows (note the trailing $, which makes sure the element ends with ".log"):
library(tidyverse)
c("mama.log", "papa.log", "mimo.png", "mentor.log") %>%
  str_subset(pattern = "^m.*\\.log$")
which produces the output:
[1] "mama.log" "mentor.log"
Using pipes...
library(tidyverse)
c("mama.log", "papa.log", "mimo.png", "mentor.log") %>%
.[grepl("^m.*\\.log$", .)]
[1] "mama.log" "mentor.log"