
Count the length (number of lines) of a CSV file?

I have a form (Rails) which allows me to load a .csv file using the file_field. In the view:

    <% form_for(:upcsv, :html => {:multipart => true}) do |f| %>
    <table>
        <tr>
            <td><%= f.label("File:") %></td>
            <td><%= f.file_field(:filename) %></td>
        </tr>
    </table>
        <%= f.submit("Submit") %>
    <% end %>

Clicking Submit redirects me to another page (create.html.erb). The file was loaded fine, and I was able to read the contents just fine in this second page. I am trying to show the number of lines in the .csv file in this second page.

My controller (semi-pseudocode):

class UpcsvController < ApplicationController
    def index
    end

    def create
        file = params[:upcsv][:filename]
        ...
        #params[:upcsv][:file_length] = file.length # Show number of lines in the file
        #params[:upcsv][:file_length] = file.size
        ...
    end
end

Both file.length and file.size return '91' when my file only contains 7 lines. From the Rails documentation that I read, once the Submit button is clicked, Rails creates a temp file of the uploaded file, and params[:upcsv][:filename] contains the contents of the temp/uploaded file and not the path to the file. I don't know how to extract the number of lines in my original file. What is the correct way to get the number of lines in the file?

My create.html.erb:

<table>
    <tr>
        <td>File length:</td>
        <td><%= params[:upcsv][:file_length] %></td>
    </tr>
</table>

I'm really new at Rails (just started last week), so please bear with my stupid questions.

Thank you!

Update: apparently that number '91' is the number of individual characters (including carriage return) in my file. Each line in my file has 12 digits + 1 newline = 13. 91/13 = 7.
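
The arithmetic in the update can be checked directly in Ruby. This is only a minimal sketch: the string `content` is a made-up stand-in for the uploaded file, assuming seven lines of 12 digits each as described above.

```ruby
# Hypothetical stand-in for the uploaded file's contents:
# seven lines, each 12 digits followed by a newline.
content = "123456789012\n" * 7

char_count = content.length      # counts every character, newlines included
line_count = content.lines.count # counts lines

puts char_count
puts line_count
```

So length and size report characters (or bytes), while lines.count reports what the question is actually after.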


All of the solutions listed here load the entire file into memory in order to count the lines. If you're on a Unix-based system, a much faster, easier and more memory-efficient solution is:

`wc -l #{your_file_path}`.to_i
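
If the path can contain spaces or comes from user input, interpolating it into backticks sends it through the shell. A possible alternative, sketched here under the assumption that `wc` is on the PATH, is to pass the arguments separately with Open3. The helper name `count_lines_with_wc` is made up for illustration.

```ruby
require 'open3'

# Hypothetical helper: counts newline characters in a file by calling wc.
# Passing the arguments separately avoids shell interpolation of the path.
def count_lines_with_wc(path)
  output, status = Open3.capture2('wc', '-l', path)
  raise "wc failed for #{path}" unless status.success?

  output.to_i # wc prints "<count> <path>"; to_i reads the leading number
end
```

Note that wc -l counts newline characters, so a last line without a trailing newline is not included in the result.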


.length and .size are synonyms. To get the row count of the CSV file you have to actually parse it. Simply counting the newlines in the file won't work, because string fields in a CSV can contain line breaks. A simple way to get the row count would be:

# The param holds the uploaded IO object rather than a path, so parse its contents.
CSV.parse(params[:upcsv][:filename].read).length
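
To see why counting newlines can give the wrong answer, here is a small sketch with made-up sample data in which one quoted field contains a line break:

```ruby
require 'csv'

# Hypothetical sample: a header row plus two records,
# where the first record's "note" field spans two physical lines.
data = "id,note\n1,\"first line\nsecond line\"\n2,plain\n"

row_count  = CSV.parse(data).length # rows as the CSV parser sees them
line_count = data.lines.count       # physical lines in the text

puts row_count
puts line_count
```

The parser reports one row fewer than the raw line count, because the quoted line break belongs to a single field.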


Another way to get the number of lines is:

file.readlines.size
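
A minimal sketch of this approach, using a StringIO as a made-up stand-in for the uploaded file (any IO-like object should behave the same way):

```ruby
require 'stringio'

# Hypothetical stand-in for the uploaded file object.
upload = StringIO.new("123456789012\n" * 7)

line_count = upload.readlines.size

# readlines leaves the read position at the end of the stream,
# so rewind before reading the contents again.
upload.rewind
first_line = upload.gets

puts line_count
```

The rewind matters if the controller goes on to parse the same upload after counting its lines.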


CSV.foreach(file_path, headers: true).count

The above excludes the header row when counting rows.

CSV.read(file_path).count
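
The difference between the two calls can be checked with a throwaway file. This is a sketch with made-up sample data (one header row, two data rows):

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample data written to a temporary file.
file = Tempfile.new(['sample', '.csv'])
file.write("name,qty\napple,3\npear,5\n")
file.close

data_rows = CSV.foreach(file.path, headers: true).count # header row excluded
all_rows  = CSV.read(file.path).count                   # header row included

file.unlink # remove the temporary file

puts data_rows
puts all_rows
```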


your_csv.count should do the trick.


If your CSV file doesn't fit in memory (so you can't use readlines), you can do:

def self.line_count(f)
  i = 0
  CSV.foreach(f) {|_| i += 1}
  i
end

Unlike wc -l, this counts actual records, not lines. The two can differ if there are newlines in field values.


Just to demonstrate what IO#readlines does:

If you had a file like this: "asdflkjasdlkfjsdakf\n asdfjljdaslkdfjlsadjfasdflkj\n asldfjksdjfa\n"

In Rails you'd do, say:

path = File.join(Rails.root, 'lib', 'file.json')
lines_ary = IO.readlines(path) # array with one string per line
lines_ary.count #=> 3

IO#readlines converts a file into an array of strings, using \n (newlines) as the separator, so it's basically like

str.split(/\n/)

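In fact, if you did

 x = file.read

then

 x.split(/\n/)

would give almost the same result as file.readlines. The difference is that readlines keeps the trailing "\n" on each line, while split drops it.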
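
A small check of how the two approaches compare, as a sketch using a throwaway file with made-up contents. The chomp: true option assumes Ruby 2.4 or later.

```ruby
require 'tempfile'

# Hypothetical three-line file.
file = Tempfile.new('demo')
file.write("one\ntwo\nthree\n")
file.close

with_newlines = File.readlines(file.path)              # each line keeps its "\n"
chomped       = File.readlines(file.path, chomp: true) # newlines stripped
split_lines   = File.read(file.path).split(/\n/)       # newlines stripped

file.unlink # remove the temporary file

p with_newlines
p chomped
p split_lines
```

All three arrays have the same number of elements, so for counting lines the choice does not matter; it only matters if the line contents are used afterwards.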

** IO#readlines can be really handy when dealing with files that have a repeating line structure ("child_id", "parent_ary", "child_id", "parent_ary", ...).
