How to detect and handle different EOL in Ruby?
I am trying to process a CSV file that can be generated with either CR or LF as the EOL marker. When I try to read the file with
infile = File.open('my.csv','r')
while line = infile.gets
...
The entire 20MB file is read in as one line.
How can I detect the EOL marker and handle it properly?
TIA
I would slurp the file, normalize the input, and then feed it to CSV:
require 'csv'
# Read in binary mode, then normalize both CRLF and bare CR to LF.
raw = File.open('my.csv','rb',&:read).gsub(/\r\n?/,"\n")
CSV.parse(raw) do |row|
# use row here...
end
The above uses File.open instead of IO.read due to slow file reads on Windows Ruby.
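If you would rather detect the EOL marker than normalize the whole file, one option is to look at the first chunk and pass the result to each_line as the separator. This is only a rough sketch: detect_eol is a hypothetical helper, the 4096-byte sample size is an arbitrary assumption, and StringIO stands in for the real file so the example runs on its own.

```ruby
require 'stringio'

# Hypothetical helper (not part of Ruby): guess the EOL marker by looking
# at the first chunk of an IO. Returns "\r\n", "\n", or "\r", and falls
# back to "\n" when the sample contains no line break at all.
def detect_eol(io, sample_size = 4096)
  sample = io.read(sample_size).to_s
  # Avoid mistaking a CRLF split across the chunk boundary for a bare CR.
  sample += io.read(1).to_s if sample.end_with?("\r")
  io.rewind
  sample[/\r\n|\n|\r/] || "\n"
end

# StringIO stands in for File.open('my.csv', 'rb') to keep the sketch self-contained.
io = StringIO.new("a,b\rc,d\re,f\r")
eol = detect_eol(io)
lines = io.each_line(eol).map { |line| line.chomp(eol) }
p eol    # => "\r"
p lines  # => ["a,b", "c,d", "e,f"]
```

This reads the file line by line instead of slurping all 20MB, but it assumes the file uses a single EOL style throughout.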
When in doubt, use a regex.
> "how\r\nnow\nbrown\r\ncow\n".split /[\r\n]+/
=> ["how", "now", "brown", "cow"]
So, something like
infile.read.split(/[\r\n]+/).each do |line|
. . .
end
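One caveat with /[\r\n]+/ is that it swallows runs of line breaks, so blank lines disappear. If blank lines matter, an alternation that matches exactly one line break at a time keeps them. A small sketch with made-up sample text:

```ruby
text = "how\r\nnow\n\nbrown\rcow\n"

# A character class with + treats any run of CR/LF as one separator.
collapsed = text.split(/[\r\n]+/)
# Matching one line break at a time keeps the empty line.
preserved = text.split(/\r\n|\r|\n/)

p collapsed  # => ["how", "now", "brown", "cow"]
p preserved  # => ["how", "now", "", "brown", "cow"]
```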
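Now, it turns out that the standard library CSV already auto-detects the line ending (its row_sep option defaults to :auto, which picks the first "\r\n", "\n", or "\r" it finds in the data), so as long as the file uses one EOL style throughout you could just do: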
CSV.parse(infile.read).each do |line|
. . .
end
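As a quick check of the CSV approach, here is a small sketch with CR-only data. The sample string is made up, and the second call just shows that you can also pass row_sep explicitly if you already know the EOL marker.

```ruby
require 'csv'

# Made-up sample data that uses a bare CR as the EOL marker.
cr_only = "id,name\r1,foo\r2,bar\r"

# With the default row_sep (:auto), CSV works out the line ending itself.
rows = CSV.parse(cr_only)
# The separator can also be stated explicitly.
explicit = CSV.parse(cr_only, row_sep: "\r")

p rows  # => [["id", "name"], ["1", "foo"], ["2", "bar"]]
```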