Count the length (number of lines) of a CSV file?
I have a form (Rails) which allows me to load a .csv file using the file_field.
In the view:
<% form_for(:upcsv, :html => {:multipart => true}) do |f| %>
<table>
<tr>
<td><%= f.label("File:") %></td>
<td><%= f.file_field(:filename) %></td>
</tr>
</table>
<%= f.submit("Submit") %>
<% end %>
Clicking Submit redirects me to another page (create.html.erb). The file was loaded fine, and I was able to read the contents just fine in this second page. I am trying to show the number of lines in the .csv file in this second page.
My controller (semi-pseudocode):
class UpcsvController < ApplicationController
  def index
  end

  def create
    file = params[:upcsv][:filename]
    ...
    #params[:upcsv][:file_length] = file.length # Show number of lines in the file
    #params[:upcsv][:file_length] = file.size
    ...
  end
end
Both file.length and file.size return '91' when my file only contains 7 lines. From the Rails documentation that I read, once the Submit button is clicked, Rails creates a temp file of the uploaded file, and params[:upcsv][:filename] contains the contents of the temp/uploaded file and not the path to the file. I don't know how to extract the number of lines in my original file. What is the correct way to get the number of lines in the file?
My create.html.erb:
<table>
<tr>
<td>File length:</td>
<td><%= params[:upcsv][:file_length] %></td>
</tr>
</table>
I'm really new at Rails (just started last week), so please bear with my stupid questions.
Thank you!
Update: apparently that number '91' is the number of individual characters (including carriage return) in my file. Each line in my file has 12 digits + 1 newline = 13. 91/13 = 7.
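The arithmetic in the update can be checked with a short Ruby sketch. The string below is a made-up stand-in for the uploaded contents (7 lines of 12 digits each), not the actual file:

```ruby
# Hypothetical stand-in for the uploaded contents: 7 lines of 12 digits each.
contents = "123456789012\n" * 7

char_count = contents.length       # counts characters, newlines included
line_count = contents.lines.count  # counts lines

puts char_count  # 91
puts line_count  # 7
```

So length and size on a string report characters, while lines.count reports what the question is actually after.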
All of the solutions listed here actually load the entire file into memory in order to get the number of lines. If you're on a Unix-based system a much faster, easier and memory-efficient solution is:
`wc -l #{your_file_path}`.to_i
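If shelling out is not an option (for example on Windows), a pure-Ruby alternative that also avoids loading the whole file is to stream it with File.foreach. This is a swapped-in technique, not the wc approach itself, and the sample file and the count_lines helper are hypothetical names used only for illustration:

```ruby
require "tempfile"

# Count physical lines by streaming the file one line at a time,
# so the whole file is never held in memory at once.
def count_lines(path)
  File.foreach(path).count
end

# Hypothetical sample file, used only for illustration.
sample = Tempfile.new(["sample", ".csv"])
sample.write("111111111111\n222222222222\n333333333333\n")
sample.close

streamed_count = count_lines(sample.path)
puts streamed_count  # 3
```

Like wc -l, this counts physical lines rather than CSV records. One difference: wc -l counts newline characters, so a final line without a trailing newline is not counted by wc but is counted here.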
.length and .size are actually synonyms. To get the row count of the CSV file you have to actually parse it. Simply counting the newlines in the file won't work, because string fields in a CSV can contain line breaks. A simple way to get the row count would be:
CSV.read(params[:upcsv][:filename]).length
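To illustrate why parsing matters, here is a small sketch with made-up data in which one quoted field contains a line break. Counting lines and counting parsed rows give different answers:

```ruby
require "csv"

# Made-up data: a header row plus two records, one of which has a
# quoted field containing a line break.
data = "id,note\n1,\"first line\nsecond line\"\n2,plain\n"

physical_lines = data.lines.count        # newline-delimited lines
csv_rows       = CSV.parse(data).length  # parsed CSV rows

puts physical_lines  # 4
puts csv_rows        # 3
```

The quoted field spans two physical lines but is still a single row once parsed.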
Another way to read the number of lines is:
file.readlines.size
CSV.foreach(file_path, headers: true).count
The above excludes the header row when counting rows.
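A minimal sketch of the difference, assuming a hypothetical temp file with one header row and two data rows:

```ruby
require "csv"
require "tempfile"

# Hypothetical sample file: one header row and two data rows.
people = Tempfile.new(["people", ".csv"])
people.write("name,age\nAda,36\nLinus,28\n")
people.close

# Without a block, CSV.foreach returns an enumerator, so count streams the rows.
all_rows  = CSV.foreach(people.path).count                 # header counted as a row
data_rows = CSV.foreach(people.path, headers: true).count  # header not counted

puts all_rows   # 3
puts data_rows  # 2
```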
CSV.read(file_path).count
your_csv.count
should do the trick.
If your CSV file doesn't fit in memory (so you can't use readlines), you can do:
def self.line_count(f)
  i = 0
  CSV.foreach(f) { |_| i += 1 }
  i
end
Unlike wc -l, this counts actual records, not lines. The two can differ if there are newlines inside field values.
Just to demonstrate what IO#readlines does:
If you had a file like this: "asdflkjasdlkfjsdakf\n asdfjljdaslkdfjlsadjfasdflkj\n asldfjksdjfa\n"
in Rails you'd do, say:
file = File.open(File.join(Rails.root, 'lib', 'file.json'))
lines_ary = IO.readlines(file)
lines_ary.count #=> 3
IO#readlines converts a file into an array of strings, using \n (newline) as the separator, much as commas often are, so it's basically like
str.split(/\n/)
In fact, if you did
x = file.read
this
x.split(/\n/)
would give nearly the same result as file.readlines; the difference is that readlines keeps the trailing \n on each line.
** IO#readlines can be really handy when dealing with files which have a repeating line structure ("child_id", "parent_ary", "child_id", "parent_ary",...) etc
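The comparison between readlines and split can be sketched as follows. The temp file and its contents are made up for illustration, and the chomp: true keyword assumes Ruby 2.4 or later:

```ruby
require "tempfile"

# Hypothetical three-line sample file.
notes = Tempfile.new(["notes", ".txt"])
notes.write("alpha\nbeta\ngamma\n")
notes.close

with_newlines = File.readlines(notes.path)               # each element keeps its "\n"
chomped       = File.readlines(notes.path, chomp: true)  # trailing "\n" removed
split_lines   = File.read(notes.path).split(/\n/)        # separators dropped by split

puts with_newlines.size      # 3
p with_newlines.first        # "alpha\n"
puts chomped == split_lines  # true
```

All three give the same element count here, so any of them works for counting lines; they differ only in whether the newline stays attached to each string.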