Stripping non-A-Z characters from a vector in R
I have a vector of usernames that have non A-Z characters in them.
I want to be able to strip those characters out.
I was told to use the letters vector, but y = x[letters] doesn't seem to work.
Thanks
If x is your vector, use gsub with a negated character class that covers both letter ranges, and replace every match with the empty string. A leading ^ inside the brackets negates the class, so the pattern matches anything that is not a letter:
gsub("[^a-zA-Z]", "", x)
For example, with some simple data:
gsub("[^a-zA-Z]", "", c(letters, LETTERS, "3s8t7a2c9k:o3v8e7r%F%L^O#W%&^%@#^"))
[1] "a" "b" "c" "d" "e" "f" "g" "h"
[9] "i" "j" "k" "l" "m" "n" "o" "p"
[17] "q" "r" "s" "t" "u" "v" "w" "x"
[25] "y" "z" "A" "B" "C" "D" "E" "F"
[33] "G" "H" "I" "J" "K" "L" "M" "N"
[41] "O" "P" "Q" "R" "S" "T" "U" "V"
[49] "W" "X" "Y" "Z" "stackoverFLOW"
Maybe this does what you want:
username <- "user12_AB"
strip_non_letters <- function(s) {
  idx <- which(strsplit(tolower(s), "")[[1]] %in% letters)
  paste(strsplit(s, "")[[1]][idx], collapse = "")
}
strip_non_letters(username)
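Note that the [[1]] means this function only looks at the first string it is given. One way to apply it to a whole vector is to call it once per element with vapply. This is only a sketch: the function is restated in a slightly condensed form so the block runs on its own, and the input vector is made up for illustration.

```r
# Same idea as above: split into characters, keep those whose
# lower-case form is in `letters`, then glue them back together
strip_non_letters <- function(s) {
  chars <- strsplit(s, "")[[1]]
  paste(chars[tolower(chars) %in% letters], collapse = "")
}

# Hypothetical input vector
usernames <- c("user12_AB", "A!ex25", "H@rry")

# vapply calls the function on each element and checks that
# every result is a single string
cleaned <- vapply(usernames, strip_non_letters, character(1), USE.NAMES = FALSE)
cleaned   # "userAB" "Aex" "Hrry"
```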
Similar to the above from Karsten; I hope it is not too redundant.
usernames <- c("A!ex25","Goerge?","H@rry","Dumbname89")
# a function to cut out non-letters
onlyletters <- function(x) {
  chars <- unlist(strsplit(x, split = ""))
  charsout <- chars[chars %in% c(letters, LETTERS)]
  paste(charsout, sep = "", collapse = "")
}
sapply(usernames,onlyletters)
    A!ex25    Goerge?      H@rry Dumbname89 
     "Aex"   "Goerge"     "Hrry" "Dumbname" 