
How do I create a string data column that is a transformation of the strings in another column in R?

If I have this data set

Browser          Count
Chrome/11         100
Chrome/11         89
Chrome/13         10
Safari/12         40
Safari/114        30      

And I want to get a more general form of the browser without the version number.

Browser          Clean_Browser     Count
Chrome/11         Chrome              100
Chrome/11         Chrome              89
Chrome/13         Chrome              10
Safari/12         Safari              40 
Safari/114        Safari              30

I know this is easy to do with Python or Excel, but is there a way to do it in R so I don't have to pre-process the data?


That is pretty straightforward thanks to R's regular expression and string processing functions. Both are vectorised, so you do not need to loop. You could use

  • gsub() et al. to replace '/...' with an empty string

  • strsplit() with '/' as the split character, retaining the first element of each result

  • certainly other ways I can't think of now, and experience suggests several will involve packages by Hadley :) [kidding aside, look at the stringr package too]

Here is approach one, done on a vector; a column in a data.frame works just the same:

R> vec <- c( paste("Chrome", 11:13, sep="/"), paste("Safari", 101:102, sep="/"))
R> vec
[1] "Chrome/11"  "Chrome/12"  "Chrome/13"  "Safari/101" "Safari/102"
R> newvec <- gsub("/.*$", "", vec, perl=TRUE)
R> newvec
[1] "Chrome" "Chrome" "Chrome" "Safari" "Safari"
R> 
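To tie this back to the question, here is a minimal sketch that applies approach one to a data frame column and also tries approach two (strsplit). The data frame is rebuilt by hand from the question's table, and the name clean_split is only illustrative. It assumes Browser is a character column; if it was read in as a factor, wrap it in as.character() before calling strsplit().

```r
# Rebuild the question's data as a data frame (values copied from the question).
df <- data.frame(
  Browser = c("Chrome/11", "Chrome/11", "Chrome/13", "Safari/12", "Safari/114"),
  Count   = c(100, 89, 10, 40, 30),
  stringsAsFactors = FALSE
)

# Approach one on a column: drop the first "/" and everything after it.
df$Clean_Browser <- sub("/.*$", "", df$Browser)

# Approach two: split on a literal "/" and keep the first piece of each element.
pieces <- strsplit(df$Browser, "/", fixed = TRUE)
clean_split <- vapply(pieces, function(p) p[1], character(1))

# Check that both approaches give the same result.
identical(df$Clean_Browser, clean_split)

# Show the columns in the order the question asked for.
df[, c("Browser", "Clean_Browser", "Count")]
```

Since there is only one match per string here, sub() and gsub() behave the same; sub() just stops after the first match.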


You can use colsplit() from the reshape package to do this.

df = read.table(textConnection(
"Browser          Count
Chrome/11         100
Chrome/11         89
Chrome/13         10
Safari/12         40
Safari/114        30"), sep = "", header = TRUE) 

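library(reshape)
# Split Browser on '/' into two new columns, 'browser' and 'version'
browser_version = colsplit(df$Browser, names = c('browser', 'version'), split = '[/]')
# Append the new columns to the original data frame
df = cbind(df, browser_version)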
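If you would rather not load a package, roughly the same result can be had in base R. This is only a sketch, not what colsplit() does internally: it assumes every Browser value contains exactly one '/', and the intermediate name parts is made up for illustration.

```r
# Same data as in the answer above, read from inline text.
df <- read.table(text = "Browser Count
Chrome/11 100
Chrome/11 89
Chrome/13 10
Safari/12 40
Safari/114 30", header = TRUE, stringsAsFactors = FALSE)

# Split each value on a literal "/" and stack the pieces into a two-column matrix.
# Assumption: every value has exactly one "/", so each split yields two pieces.
parts <- do.call(rbind, strsplit(df$Browser, "/", fixed = TRUE))

# Name the pieces and convert the version to an integer.
browser_version <- data.frame(
  browser = parts[, 1],
  version = as.integer(parts[, 2]),
  stringsAsFactors = FALSE
)

# Append the new columns to the original data frame.
df <- cbind(df, browser_version)
df
```

Keeping the version as its own column makes it easy to filter or aggregate by version later, which the gsub() approach throws away.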