read.delim() - errors "more columns than column names" and "header and 'col.names' are of different lengths"
Preliminary information: OS: Windows XP Professional Version 2002 Service Pack 3; R version: R 2.12.2 (2011-02-25)
I am attempting to read a 30,000 row by 80 column, tab-delimited text file into R using the read.delim() function. This file does have column headers with the following naming convention: "_". The code that I use to attempt to read the data in is:
cc <- c("integer", "character", "integer", rep("character", 3),
        rep("integer", 73))
example_data <- read.delim(file = 'C:/example.txt', row.names = FALSE,
                           col.names = TRUE, as.is = TRUE, colClasses = cc)
After I submit this command, I receive the following error message:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names
In addition: Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
header and 'col.names' are of different lengths
Information that may be important - from column 8 until column 80 the count of zeros in each column is as follows:
column 08: 29,000 zeros
column 13: 15,000 zeros
column 19: 500 zeros
column 43: 15,000 zeros
columns 65-80: 29,000 zeros for each column
Can anyone help identify reasons that I am receiving the above error messages? Any help will be greatly appreciated.
The cause of the problem is your use of the col.names=TRUE argument. col.names is meant for specifying the column names of the resulting data frame manually, so it must be a vector with one name per column in the input.
If you want read.delim to take column names from the file, use header=TRUE instead (this is already the default for read.delim). You may also wish to reconsider row.names=FALSE, since row.names is likewise a specification of the row names rather than an instruction to read them from the file.
More information is available on the help page for read.delim.
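As a minimal sketch of the advice above, the call can be written without col.names and row.names so that the header line supplies the names. The temporary file and its four columns are made-up stand-ins for C:/example.txt, and the shortened cc vector is an assumption chosen to match that toy file, not the asker's real data.

```r
# Build a small tab-delimited file that stands in for C:/example.txt
# (hypothetical data: 3 rows and 4 columns instead of 30,000 x 80).
tmp <- tempfile()
writeLines(c("id_a\tname_b\tcount_c\tcount_d",
             "1\tfoo\t0\t5",
             "2\tbar\t3\t0",
             "3\tbaz\t0\t0"), tmp)

# One class per column, in the same spirit as the cc vector in the question.
cc <- c("integer", "character", rep("integer", 2))

# header = TRUE takes the column names from the first line of the file;
# row.names and col.names are simply left at their defaults.
example_data <- read.delim(file = tmp, header = TRUE, as.is = TRUE,
                           colClasses = cc)

str(example_data)
unlink(tmp)
```

With a real file it may also be worth comparing length(cc) against the number of columns, since colClasses is recycled when it is shorter than the column count.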
I also recently had the same error, and it disappeared after I converted the file to comma- or semicolon-delimited format and read it with read.csv / read.csv2. I know this is not a fully satisfying answer, but it might be worth checking.
If you want to read the data as a character matrix, first convert your file to .csv format and use read.csv. Don't pass any argument other than the file name, e.g.:
read.csv("filepath")
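A small sketch of what this looks like in practice, using a made-up comma-delimited file. Note that read.csv() with only a file name infers a class for each column. The colClasses = "character" and as.matrix() steps below are an assumption about how one might actually get a character matrix, not part of the suggestion above.

```r
# Hypothetical comma-delimited file standing in for the converted data.
tmp <- tempfile()
writeLines(c("id_a,name_b,count_c",
             "1,foo,0",
             "2,bar,3"), tmp)

# File name only: read.csv() guesses a class for each column.
inferred <- read.csv(tmp)
sapply(inferred, class)

# Force every column to character, then convert the data frame to a matrix.
as_text <- read.csv(tmp, colClasses = "character")
char_matrix <- as.matrix(as_text)

unlink(tmp)
```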