Using regex in R to find strings as whole words (but not strings as part of words)
I'm searching for the right regular expression. The following
t1 = c("IGF2, IGF2AS, INS, TH", "TH", "THZH", "ZGTH")
grep("TH",t1, value=T)
returns all elements of t1, but only the first and second are correct. How can I return only the entries that contain TH as a whole word?
You need to add word boundary anchors (\b) around your search string so that only entire words are matched, i.e. words surrounded by non-word characters or by the start/end of the string. Here "word character" means \w, i.e. a letter, a digit, or an underscore.
Try
grep("\\bTH\\b",t1, value=T)
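To see the difference the anchors make, here is a small sketch that runs both patterns on the vector from the question. The variable names unanchored and whole_word are only illustrative, and it assumes R's default regex engine (no perl=TRUE or fixed=TRUE).

```r
# Sample vector from the question
t1 <- c("IGF2, IGF2AS, INS, TH", "TH", "THZH", "ZGTH")

# Without anchors, every element matches because each contains "TH" somewhere
unanchored <- grep("TH", t1, value = TRUE)

# With \\b on both sides, "TH" must stand alone as a word
whole_word <- grep("\\bTH\\b", t1, value = TRUE)

print(whole_word)
```

The backslash is doubled because R string literals consume one level of escaping before the pattern reaches the regex engine.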
You can use \< and \> in a regexp to match at the beginning and end of a word, respectively.
grep ("\\<TH\\>", t1)
etc.
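As a rough illustration of what that call gives back, the sketch below uses the same \< and \> pattern with grep() and grepl() on the vector from the question. The names idx and hits are made up for the example, and it assumes the default regex engine.

```r
# Sample vector from the question
t1 <- c("IGF2, IGF2AS, INS, TH", "TH", "THZH", "ZGTH")

# grep() without value = TRUE returns the positions of matching elements
idx <- grep("\\<TH\\>", t1)

# grepl() returns one TRUE/FALSE per element, handy for subsetting
hits <- grepl("\\<TH\\>", t1)

print(t1[hits])
```

Use value = TRUE in grep() if you want the matching strings themselves rather than their positions.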