truncate string from a certain character in R [duplicate]
I have a list of strings in R which looks like:
WDN.TO
WDR.N
WDS.AX
WEC.AX
WEC.N
WED.TO
I want to extract the suffix of each string, starting from the "." character. The result should look like:
.TO
.N
.AX
.AX
.N
.TO
Does anyone have any ideas?
Joshua's solution works fine. I'd use sub instead of gsub though. gsub is for substituting multiple occurrences of a pattern in a string; sub is for one occurrence. The pattern can be simplified a bit too:
> x <- c("WDN.TO","WDR.N","WDS.AX","WEC.AX","WEC.N","WED.TO")
> sub("^[^.]*", "", x)
[1] ".TO" ".N" ".AX" ".AX" ".N" ".TO"
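To make the sub/gsub difference concrete, here is a small sketch. The inputs ("aaa" and a string with two dots) are made up for illustration and are not from the question. It also shows that this pattern and the gsub pattern from the other answer stop agreeing once a string contains more than one ".".

```r
# Hypothetical inputs, chosen only to show the behaviour
s <- "aaa"
first_only <- sub("a", "b", s)    # replaces the first match only
every_one  <- gsub("a", "b", s)   # replaces every match

# With more than one ".", the two patterns from the answers differ
z <- "BRK.A.N"
from_first_dot <- sub("^[^.]*", "", z)     # drops everything before the first "."
from_last_dot  <- gsub("^.*\\.", ".", z)   # greedy: drops everything before the last "."

print(c(first_only, every_one))
print(c(from_first_dot, from_last_dot))
```

Which of the two results is wanted depends on the data; for the strings in the question, which have exactly one ".", both patterns give the same answer.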
...But if the strings are as regular as in the question, then simply stripping the first 3 characters should be enough:
> x <- c("WDN.TO","WDR.N","WDS.AX","WEC.AX","WEC.N","WED.TO")
> substring(x, 4)
[1] ".TO" ".N" ".AX" ".AX" ".N" ".TO"
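If the prefix is not always 3 characters long, one possible variation is to look up the position of the "." first and pass that to substring. This is only a sketch: the vector v is hypothetical, and it assumes every string contains a "." (regexpr returns -1 when there is no match, which is not handled here).

```r
# Hypothetical tickers with prefixes of different lengths
v <- c("WDN.TO", "ABCD.N", "XY.AX")

# Position of the first literal "." in each string (fixed = TRUE disables regex)
pos <- as.integer(regexpr(".", v, fixed = TRUE))

# Keep everything from that position onward
suffix <- substring(v, pos)
print(suffix)
```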
Using gsub:
x <- c("WDN.TO","WDS.N")
# replace everything from the start of the string to the "." with "."
gsub("^.*\\.",".",x)
# [1] ".TO" ".N"
Using strsplit:
# strsplit returns a list; use sapply to get the 2nd obs of each list element
y <- sapply(strsplit(x,"\\."), `[`, 2)
# since we split on ".", we need to put it back
paste(".",y,sep="")
# [1] ".TO" ".N"
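One caveat with the strsplit approach: taking only the 2nd piece assumes each string has exactly one ".". The sketch below uses made-up inputs (one with two dots, one with none) to show what happens otherwise, and one possible way to keep the whole suffix by rejoining every piece after the first.

```r
# Hypothetical inputs: one ordinary, one with two dots, one with no dot
w <- c("WDN.TO", "BRK.A.N", "PLAIN")
parts <- strsplit(w, ".", fixed = TRUE)

# Taking only the 2nd piece drops ".N" from the second string
# and yields NA for the string without a "."
second <- sapply(parts, `[`, 2)

# Rejoining every piece after the first keeps the whole suffix
rest <- sapply(parts, function(p) {
  if (length(p) < 2) return(NA_character_)
  paste0(".", paste(p[-1], collapse = "."))
})

print(second)
print(rest)
```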
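strsplit might do it too, but two things need care: the "." has to be matched literally (a bare "." is a regular expression that matches any character), and strsplit returns a list, so it cannot be indexed with [,2]. Extracting the 2nd piece with [[ would stop with a "subscript out of bounds" error for any string that has no "."; using [ returns NA there instead.
x <- c("WDN.TO","WDR.N","WDS.AX","WEC.AX","WEC.N","WED.TO")
y <- sapply(strsplit(x, ".", fixed = TRUE), `[`, 2)
# output: y = "TO" "N" "AX" "AX" "N" "TO"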