Ruby: cannot parse Excel file exported as CSV in OS X
I'm using Ruby's CSV library to parse some CSV. I have a seemingly well-formed CSV file that I created by exporting an Excel file as CSV.
However, CSV.open(filename, 'r') causes a CSV::IllegalFormatError.
There are no rogue commas or quotation marks in the file, nor anything else that I can see that might cause problems.
I suspect the problem could be to do with line endings. I am able to parse data entered manually via a text editor (Aquamacs). It is just when I try with data exported from Excel (for OS X) that problems occur. When I open up the exported CSV in vim, all the text appears on one line, with ^M appearing between lines.
From the docs, it seems that you can provide open with a row separator; however, I am unsure what it should be in this case.
Try: CSV.open('filename', 'r', ?,, ?\r)
As cantlin notes, for Ruby 2 it's:
CSV.open('file.csv', 'r', :col_sep => ?,, :row_sep => ?\r)
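I'm pretty sure these will DTRT for you. You can also "fix" the file itself (in which case keep the old open) with the following vim command: :%s/\r/\r/g
Yes, I know that command looks like a total no-op, but it will work: in a vim substitution, \r in the pattern matches a carriage return, while \r in the replacement inserts a line break.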
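To make the row separator approach concrete, here is a minimal sketch that parses an in-memory string instead of a file. The sample data is made up for illustration, and the keyword-argument form assumes the CSV library that ships with Ruby 1.9 or later.

```ruby
require "csv"

# Hypothetical sample with CR-only row endings, like the Excel for OS X export.
data = "name,qty\rapple,3\rpear,5\r"

# State explicitly that fields are comma separated and rows end with a bare CR.
rows = CSV.parse(data, col_sep: ",", row_sep: "\r")

rows.each { |row| puts row.inspect }
```

The same options can be passed to CSV.open or CSV.foreach when reading from a file.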
Stripping \r characters seemed to work for me:
CSV.parse(File.read('filename').gsub(/\r/, '')) do |row|
...
end
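Stripping \r outright suits files whose rows end in \r\n; if the rows end in a bare \r, removing it would join every row into one. A variation, sketched below on made-up sample data, converts both styles to \n instead, so the parser's defaults apply. One caveat: this also rewrites any carriage returns inside quoted fields.

```ruby
require "csv"

# Hypothetical sample mixing CR-only and CRLF row endings.
raw = "a,b\r1,2\r\n3,4\r"

# Turn CRLF and bare CR into LF; "\r\n?" matches a CR with an optional LF after it.
normalized = raw.gsub(/\r\n?/, "\n")

rows = CSV.parse(normalized)
rows.each { |row| puts row.inspect }
```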
Another option is to open the CSV file or the original spreadsheet in Excel and save it as "Windows Comma Separated" rather than "Comma Separated Values". This will output the file with line endings that FasterCSV is able to understand.
""" When I open up the exported CSV in vim, all the text appears on one line, with ^M appearing between lines.
From the docs, it seems that you can provide open with a row separator; however I am unsure what it should be in this case. """
Read that sentence back: ^M means keyboard Ctrl-M, aka '\x0D' (M is the 13th letter of the alphabet; 0x0D == 13), aka ASCII CR (carriage return), aka '\r'. In other words, it is what Macs used as a line terminator before OS X.
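The equivalences above can be checked quickly in Ruby. This is only an illustrative sketch; the variable names are arbitrary.

```ruby
cr = "\r"

code_point     = cr.ord                 # numeric value of carriage return
same_as_hex    = (cr == "\x0D")         # hex escape for the same byte
same_as_ctrl_m = (cr == "\cM")          # Ruby's control-character escape for Ctrl-M
letter_index   = "M".ord - "A".ord + 1  # position of M in the alphabet

puts code_point      # 13
puts same_as_hex     # true
puts same_as_ctrl_m  # true
puts letter_index    # 13
```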
It seems newer versions of the CSV parser, or some component it uses, read DOS/Windows line endings without issues. The stock Ruby on Mac OS X (not sure which version) was not cutting it; I installed Ruby 2.0.0 and it parsed the file just fine, without the special arguments.
I had a similar problem. I got an error:
"error_message"=>"Illegal quoting in line 1.", "error_class"=>"CSV::MalformedCSVError"
The problem was that the file had Windows line endings, which differ from Unix line endings. What helped me was defining row_sep: "\r\n":
CSV.open(path, 'w', headers: :first_row, col_sep: ';', quote_char: '"', row_sep: "\r\n")
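For the reading side, the same options can be tried on a string first. The sketch below assumes semicolon-separated data with Windows line endings; the sample content is invented for illustration.

```ruby
require "csv"

# Hypothetical sample: semicolon-separated, CRLF row endings, one quoted field.
data = "id;name\r\n1;\"Smith; J\"\r\n"

table = CSV.parse(
  data,
  headers: :first_row,  # treat the first row as column names
  col_sep: ";",
  quote_char: '"',
  row_sep: "\r\n"
)

# Each CSV::Row converts to a plain Hash keyed by header.
records = table.map(&:to_h)
records.each { |record| puts record.inspect }
```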