Problem with specifying colClasses in read.csv in R
I am trying to specify colClasses in read.csv to speed up reading a CSV file. However, I run into the following problem.
Suppose I have a file called "t.csv":
"a","b"
"x","0"
Then, if I run the following in R:
data <- read.csv('t.csv', stringsAsFactors=FALSE, check.names=FALSE, comment.char='', colClasses=c('character','numeric'))
I get this error:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
scan() expected 'a real', got '"0"'
At first I thought the problem was with my quoting, but passing quote='"' to read.csv didn't help.
Your second column is not numeric: it is quoted, and that makes it text. So read it as text, then call as.numeric(...) on the column. Or alter the file.
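A minimal sketch of the read-as-text approach, assuming the same t.csv as in the question. The writeLines call is not part of the original question; it is only there to recreate the file so the example is self-contained. A single colClasses value is recycled across all columns.

```r
# Recreate the example file from the question (assumption: same two lines).
writeLines(c('"a","b"', '"x","0"'), "t.csv")

# Read every column as character, so the quoted "0" is accepted as text.
data <- read.csv("t.csv", stringsAsFactors = FALSE, check.names = FALSE,
                 comment.char = "", colClasses = "character")

# Convert the second column to numeric after reading.
data$b <- as.numeric(data$b)
str(data)
```

This keeps the benefit of telling read.csv the column types up front, at the cost of one explicit conversion per numeric column.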
Further to Dirk,
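You can simply drop the colClasses argument and the file will read in fine. Without colClasses, read.csv reads every field as text, honours the quotes, and converts the column types afterwards.
data <- read.csv('t.csv', stringsAsFactors=FALSE, check.names=FALSE, comment.char='')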
str(data)
Gives:
> str(data)
'data.frame': 1 obs. of 2 variables:
$ a: chr "x"
$ b: int 0
> class(data$b)
[1] "integer"
You should be able to do everything you want with that second column now.
GL