Manipulating values with commas [duplicate]

2023-01-11 23:44

This question already has answers here: Closed 11 years ago.

Possible Duplicate:
How can I declare a thousand separator in read.csv?

I actually have a solution for this problem, but I am curious if there is a better way to do what I was trying to do.

I scraped some data from majorleaguesoccer.com and read it into R using

mls.reg.tmp <- read.table("../data/mls_reg_season_20100812.csv",
                          header = FALSE, sep = ";")

Note that I used sep = ";" because some of the attendance figures were in the thousands on the website and I scraped them "as is", e.g.,

> str(mls.reg.dat$a_tot)
 Factor w/ 164 levels " 166,060"," 171,282",..: 132 45 159 153 46 160 158 148 150 98 ...

In hindsight, I should have just removed the commas in Python during the pre-processing step of this project. I should also point out that there were some text fields in the data set as well.

> str(mls.reg.dat$team)
 Factor w/ 20 levels "Chicago Fire",..: 4 9 19 11 3 10 13 16 5 6 ...

Given that I want to use the attendance data as a numeric value, I converted using as.numeric and gsub. As an example in a call to ggplot:

ggplot(data = mls.reg.dat, aes(x = as.numeric(gsub(",", "", a_tot)),
                               y = sog)) +
  geom_point() +
  facet_wrap(~ team)

Question: Is this the most efficient way of working with data such as this? Or is there a specialized function for doing something along these lines?

I'm posting the question here because I spent quite a bit of time (> 30 min) working out this simple solution and thought that others might benefit from it as well.

I am not aware of any specialised function, but you could do it directly when you read the data.

  data <- read.table(...)
  data$someColumn <- as.numeric(gsub(",", "", data$someColumn))

Any subsequent call can then use data$someColumn directly, with no further conversion, which also makes the code easier to read.
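
To make the idea concrete, here is a minimal sketch of that conversion on a tiny made-up sample. The two rows in raw are invented for illustration (they are not the real scraped data), and the column names team, a_tot and sog are simply borrowed from the question.

```r
# Hypothetical two-row sample shaped like the scraped file:
# ';' as the field separator, ',' as the thousands separator.
raw <- "Chicago Fire; 166,060;12\nFC Dallas; 171,282;9"

dat <- read.table(text = raw, sep = ";", header = FALSE,
                  col.names = c("team", "a_tot", "sog"),
                  stringsAsFactors = TRUE)

# gsub() coerces the factor to character and strips the commas;
# as.numeric() then parses the remaining digits (leading blanks are ignored).
dat$a_tot <- as.numeric(gsub(",", "", dat$a_tot))

str(dat$a_tot)
```

After this single assignment, dat$a_tot can be passed straight to ggplot or any other function as an ordinary numeric column.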

EDIT: seems to be duplicate of "How can I declare a thousand separator in read.csv?"
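
If you would rather have the conversion happen while the file is being read, one possibility is to register a character-to-numeric coercion and name it in colClasses. This is only a sketch under assumptions: the class name "num.with.commas" is made up for this example, and the sample rows are invented rather than taken from the real file.

```r
library(methods)  # provides setClass() and setAs()

# "num.with.commas" is a hypothetical class name; it exists only so that
# read.table() can look up the coercion method registered below.
setClass("num.with.commas")
setAs("character", "num.with.commas",
      function(from) as.numeric(gsub(",", "", from)))

# Hypothetical sample shaped like the scraped file.
raw <- "Chicago Fire; 166,060;12\nFC Dallas; 171,282;9"

dat <- read.table(text = raw, sep = ";", header = FALSE,
                  col.names = c("team", "a_tot", "sog"),
                  colClasses = c("character", "num.with.commas", "integer"))

str(dat)
```

The trade-off is a little setup code in exchange for columns that arrive already numeric, which may be worthwhile when several columns or several files use the same format.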
