grep at the beginning of the string with fixed =T in R?
How to grep with fixed=T, but only at the beginning of the string?
grep("a.", c("a.b", "cac", "sss", "ca.f"), fixed = T)
# 1 4
I would like to get only the first occurrence. [Edit: the string to match is not known in advance, and can be anything. "a." is just for the sake of example]
Thanks.
[Edit: I sort of solved it now, but any other ideas are highly welcome. I will accept as an answer any alternative solution.
s <- "a."
res <- grep(s, c("a.b", "cac", "sss", "ca.f"), fixed = T, value = T)
res[substring(res, 1, nchar(s)) == s]
]
If you want to match an exact string (string 1) at the beginning of another string (string 2), just take the first nchar(string 1) characters of string 2 and compare them to string 1 with ==. That should be fairly fast.
Actually, Greg -and you- have mentioned the cleanest solution already. I would even drop the grep altogether:
> name <- "a#"
> string <- c("a#b", "cac", "sss", "ca#f")
> string[substring(string, 1, nchar(name)) == name]
[1] "a#b"
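For what it's worth, on R 3.3.0 or later the same comparison can be written with base R's startsWith(), which does a literal prefix test with no regular expression involved. A minimal sketch, reusing the name and string values from above:

```r
name <- "a#"
string <- c("a#b", "cac", "sss", "ca#f")

# startsWith() compares literally, so "#" or "." need no escaping
hits <- startsWith(string, name)

string[hits]   # the matching elements
which(hits)    # their positions, like grep() without value = TRUE
```

This avoids both grep and the substring/nchar bookkeeping, at the cost of requiring a reasonably recent R.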
But if you really insist on grep, you can use DWin's approach, or the following mind-boggling solution:
specialgrep <- function(x,y,...){
grep(
paste("^",
gsub("([].^+?|[#\\-])","\\\\\\1",x)
,sep=""),
y,...)
}
> specialgrep(name,string,value=T)
[1] "a#b"
I may have forgotten to include some characters in the gsub. Be sure to keep the ] symbol first and the - last in the character set, otherwise you'll get errors. Or just forget about it and use your own solution. This one is just for fun's sake :-)
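If the goal is only to avoid listing every metacharacter by hand, one alternative is to let PCRE quote the pattern with \Q...\E. This is a sketch, not a drop-in replacement: quotedgrep is a made-up name, it needs perl = TRUE, and it will break if the pattern itself contains \E.

```r
# Hypothetical helper: anchor at the start and quote x literally.
# \Q ... \E tells PCRE to treat everything in between as plain text.
quotedgrep <- function(x, y, ...) {
  grep(paste0("^\\Q", x, "\\E"), y, perl = TRUE, ...)
}

quotedgrep("a.", c("a.b", "cac", "sss", "ca.f"))
quotedgrep("a#", c("a#b", "cac", "sss", "ca#f"), value = TRUE)
```

Extra arguments such as value = TRUE are passed through to grep, as in specialgrep above.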
Do you want to use fixed=T because of the "." in the pattern? In that case you can just escape the "."; this would work:
grep("^a\\.", c("a.b", "cac", "sss", "ca.f"))
If you only want to focus on the first two characters, then only present that much information to grep:
> grep("a.", substr(c("a.b", "cac", "sss", "ca.f"), 1,2) ,fixed=TRUE)
[1] 1
You could easily wrap it into a function:
> checktwo <- function (patt,vec) { grep(patt, substr(vec, 1,nchar(patt)) ,fixed=TRUE) }
> checktwo("a.", c("a.b", "cac", "sss", "ca.f") )
[1] 1
I think Dr. G had the key to the solution in his answer, but didn't explicitly call it out: "^" in the pattern specifies "at the beginning of the string". ("$" means at the end of the string)
So an unescaped "^a." pattern means "at the beginning of the string, look for an 'a' followed by one character of anything [the '.']", while his "^a\\." escapes the dot so that only a literal "." matches.
Or you could just use "^a" as the pattern unless you don't want to match the one character string containing only "a".
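To make the difference concrete, here is a small sketch of the patterns discussed above. The extra element "axb" is added only for illustration; it is there to show that an unescaped "." matches any character.

```r
x <- c("a.b", "cac", "sss", "ca.f", "axb")

grep("^a.", x)     # "a" plus any one character at the start
grep("^a\\.", x)   # "a" plus a literal "." at the start
grep("^a", x)      # anything starting with "a"
grep("f$", x)      # anything ending with "f"
```

Only the second pattern is equivalent to a fixed match of "a." at the start of the string.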
Does that help?
Jeffrey