regex - return all before the second occurrence
Given this string:
DNS000001320_309.0/121.0_t0
How would I return everything before the second occurrence of "_"?
DNS000001320_309.0/121.0
I am using R.
Thanks.
The following script:
s <- "DNS000001320_309.0/121.0_t0"
t <- gsub("^([^_]*_[^_]*)_.*$", "\\1", s)
t
will print:
DNS000001320_309.0/121.0
A quick explanation of the regex:
^ # the start of the input
( # start group 1
[^_]* # zero or more chars other than `_`
_ # a literal `_`
[^_]* # zero or more chars other than `_`
) # end group 1
_ # a literal `_`
.* # consume the rest of the string
$ # the end of the input
which is replaced with:
\\1 # whatever is matched in group 1
And if there are fewer than two underscores, the pattern does not match and the string is returned unchanged.
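To make the unchanged-input behaviour concrete, here is a minimal sketch that applies the same pattern to a vector. Every input except the original string is made up for illustration, and sub is used instead of gsub on the assumption that one replacement per string is enough (the pattern is anchored, so it can match at most once anyway).

```r
# Hypothetical inputs with two, three, one and zero underscores
x <- c("DNS000001320_309.0/121.0_t0", "a_b_c_d", "a_b", "abc")

# Keep everything before the second "_"; sub() is vectorized over x
res <- sub("^([^_]*_[^_]*)_.*$", "\\1", x)
res
# [1] "DNS000001320_309.0/121.0" "a_b" "a_b" "abc"
```

The last two elements contain fewer than two underscores, so they come back untouched.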
I think this might do the task (a regex that matches the last "_" and everything after it, so removing the match leaves everything before the last occurrence of "_"):
_([^_]*)$
E.g.:
> sub('_([^_]*)$', '', "DNS000001320_309.0/121.0_t0")
[1] "DNS000001320_309.0/121.0"
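Note that this pattern is tied to the last underscore, so it gives "everything before the second underscore" only when the string contains exactly two of them. A small sketch of the difference, using a made-up input with three underscores and the second-occurrence pattern from the earlier answer:

```r
y <- "a_b_c_d"  # hypothetical input with three underscores

# Strip the last "_" and everything after it
before_last <- sub("_([^_]*)$", "", y)

# Keep everything before the second "_"
before_second <- sub("^([^_]*_[^_]*)_.*$", "\\1", y)

before_last    # "a_b_c"
before_second  # "a_b"
```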
Personally, I hate regex, so luckily there's a way to do this without them, just by splitting the string:
> s <- "DNS000001320_309.0/121.0_t0"
> paste(strsplit(s,"_")[[1]][1:2],collapse = "_")
[1] "DNS000001320_309.0/121.0"
Although of course this assumes that there is at least one underscore in your string: with none, [1:2] picks up an NA and the result ends in "_NA". The [[1]] also only looks at the first element, so be careful if you vectorize this.
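If the input is a vector, or some strings might have no underscore, one possible variant is sketched below. The helper name before_second_underscore is hypothetical, and head(p, 2) is swapped in for [1:2] so that short inputs are not padded with NA. One caveat: strsplit drops a trailing empty piece, so a string with a single underscore at its very end (such as "a_") would lose that underscore.

```r
# Hypothetical helper: keep at most the first two "_"-separated pieces
before_second_underscore <- function(x) {
  parts <- strsplit(x, "_", fixed = TRUE)  # one character vector per input string
  vapply(parts, function(p) paste(head(p, 2), collapse = "_"), character(1))
}

out <- before_second_underscore(c("DNS000001320_309.0/121.0_t0", "a_b", "abc"))
out
# [1] "DNS000001320_309.0/121.0" "a_b" "abc"
```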
Not pretty, but this will do the trick:
mystr <- "DNS000001320_309.0/121.0_t0"
mytok <- paste(strsplit(mystr,"_")[[1]][1:2],collapse="_")