Most Efficient Method for Reading Input in Ruby

In Ruby, what's the most efficient method for reading giant text files? On the order of 10^7 lines with 89 bytes/line. Is one method significantly better than another?


I did some benchmarks a while back to see what would be a good way to load a text file. The fastest was to read the file in large blocks of text, then iterate over the lines in each block using String#lines.

As a baseline, reading a 188,593,869-byte text file line by line:

IO.foreach(ARGV.shift) do |li|
  print li
end

time ruby test.rb root.mbox > /dev/null

real    0m3.949s
user    0m3.709s
sys     0m0.182s

I dump it to /dev/null to remove screen I/O from the timing.

Instead of reading exclusively line-by-line, load it in a big chunk then iterate over the lines:

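# Slurp the whole file into memory, then iterate over its lines.
# (With a block, String#lines behaves like each_line, which newer Rubies prefer.)
File.read(ARGV.shift).lines do |l|
  print l
end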

time ruby test.rb root.mbox > /dev/null

real    0m3.492s
user    0m3.281s
sys     0m0.209s

That saves about 0.46 seconds of real time (3.949s versus 3.492s). It also pulled all 188MB into memory at once, which won't scale well to bigger files. The nice thing is that read() lets you either load the entire file, as I did, or pass a length to limit how much is read at a time.
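
To illustrate the "limit the read size" option, here is a minimal sketch of reading in fixed-size blocks and still getting whole lines out. The helper name each_line_chunked and the 1 MiB default are my own assumptions, not something from the benchmark above. The only subtle part is that a block can end in the middle of a line, so the partial tail is carried over to the next block.

```ruby
# Hypothetical helper: yields every line of the file at `path`,
# reading at most `chunk_size` bytes from disk at a time.
def each_line_chunked(path, chunk_size = 1 << 20)
  return enum_for(:each_line_chunked, path, chunk_size) unless block_given?

  File.open(path, "rb") do |io|
    leftover = String.new # binary buffer for a partial line at a block boundary
    while (chunk = io.read(chunk_size)) # IO#read(length) returns nil at EOF
      leftover << chunk
      # Everything up to and including the last newline is complete lines.
      last_nl = leftover.rindex("\n")
      next if last_nl.nil? # no full line yet, keep reading

      complete = leftover.slice!(0..last_nl)
      complete.each_line { |line| yield line }
    end
    # A final line without a trailing newline is still a line.
    yield leftover unless leftover.empty?
  end
end
```

Something like each_line_chunked(ARGV.shift) { |l| print l } would then stand in for the File.read version, with memory use bounded by the block size plus the longest line. The lines come back as binary strings because the file is opened in "rb" mode. I have not timed this variant, so treat it as a starting point for your own measurements.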

For reference, here's cleaned-up wc output for the text file:

lines: 2,465,369
words: 26,466,463
bytes: 188,593,869
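
If you'd rather repeat the comparison inside a single Ruby process instead of using the shell's time, a rough sketch could look like this. The helper name compare_readers is hypothetical. It writes to File::NULL for the same reason I redirected to /dev/null, and it uses each_line for the slurped string. Startup time of the interpreter is not included, so the numbers won't match the ones above exactly.

```ruby
# Hypothetical helper: runs both reading strategies against `path`,
# discarding their output, and returns a hash of label => elapsed seconds.
def compare_readers(path)
  readers = {
    "IO.foreach" => ->(out) { IO.foreach(path) { |li| out.print li } },
    "File.read"  => ->(out) { File.read(path).each_line { |l| out.print l } },
  }

  File.open(File::NULL, "w") do |out|
    readers.transform_values do |reader|
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      reader.call(out)
      Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    end
  end
end

# Usage: ruby compare.rb root.mbox
if __FILE__ == $PROGRAM_NAME && ARGV.first
  compare_readers(ARGV.first).each do |label, seconds|
    puts format("%-12s %.3fs", label, seconds)
  end
end
```

Run it a few times and ignore the first pass, since the operating system's file cache will favor whichever reader goes second.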