Problem with specifying colClasses in read.csv in R

I am trying to specify colClasses in read.csv to speed up reading a CSV file. However, I run into the following problem.

Assume I have a file called "t.csv":

"a","b"
"x","0"

Then, if I run the following in R:

data <- read.csv('t.csv', stringsAsFactors=FALSE, check.names=FALSE, comment.char='', colClasses=c('character','numeric'))

I got this error:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  scan() expected 'a real', got '"0"'

At first I thought the problem was with my quoting, but passing quote='"' to read.csv didn't help.


Your second column is quoted, and that makes it text. scan() is told to expect a number there, finds "0" in quotes, and stops.

So read it as text, then call as.numeric(...) on the column. Or alter the file so the numbers are not quoted.
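
The read-as-text route could look something like the sketch below. It rebuilds the two-line file from the question in a temporary location (the `path` variable is only there to make the example self-contained; with the real file you would pass 't.csv' instead) and assumes every value in column b actually parses as a number.

```r
# Recreate the sample file from the question (temporary path, for illustration only).
path <- tempfile(fileext = ".csv")
writeLines(c('"a","b"', '"x","0"'), path)

# Read both columns as character; a single colClasses value is recycled across columns.
data <- read.csv(path, stringsAsFactors = FALSE, check.names = FALSE,
                 comment.char = "", colClasses = "character")

# Convert the second column to numeric after the read.
data$b <- as.numeric(data$b)
str(data)
```

This keeps colClasses in the call, so read.csv does not have to guess the column types, and the conversion is a separate explicit step.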


Further to Dirk,

You can simply drop the colClasses argument and the file will read in fine.

data <- read.csv('t.csv', stringsAsFactors=FALSE, check.names=FALSE, comment.char='')
str(data)

Gives:

> str(data)
'data.frame':   1 obs. of  2 variables:
 $ a: chr "x"
 $ b: int 0
> class(data$b)
[1] "integer"
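
As far as I understand the read.table documentation, when colClasses is not given the columns are first read as character and then converted with type.convert(), so the quotes are already gone by the time the type is guessed. A small sketch of that conversion step on its own, using the value from the sample file:

```r
# The quoted field "0" reaches type.convert() as the plain string 0.
b <- type.convert("0", as.is = TRUE)
class(b)
```

If you need a double rather than an integer, as.numeric(data$b) will convert the column afterwards.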

You should be able to do everything you want with that second column now.

GL
