CSV row reader in Java

I am a basic-level Java programmer working with CSV files. I have a file with rows and columns as follows:

     col1   col2   col3
row1
row2
row3

I read this file and stored it in a String. I split the string on line breaks to get every row. I also have an ArrayList variable that holds some row names. How can I do a comparison so that only those specific rows are returned?


Correctly parsing CSV files is trickier than it might seem at first sight. At the very least you need to:

  • Honour the original text encoding
  • Make sure you can import escaped delimiters, i.e.: 23,10/02/2010,"hello, world",34.5
  • Apply correct date format and decimal point format depending on the file locale
  • Treat the quotes correctly
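To illustrate the escaped-delimiter and quoting points above, here is a minimal sketch of splitting a single line while honouring double quotes. The class name CsvLineSplitter is made up for this example, and it assumes a comma delimiter, doubled quotes ("") as the escape, and no line breaks inside fields. It is not a full CSV parser, which is one more reason to prefer a library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of any CSV library.
class CsvLineSplitter {

    // Splits one line on commas, ignoring commas that sit inside double quotes.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // delimiters are plain text here
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // The sample line from the list above yields four fields, not five.
        for (String field : split("23,10/02/2010,\"hello, world\",34.5")) {
            System.out.println(field);
        }
    }
}
```

A plain String.split(",") on the same line would cut "hello, world" in two, which is the failure mode the second bullet warns about.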

If it's a quick task, I suggest using an existing library. There are at least two open-source CSV libraries for Java with very similar APIs:

  1. Java CSV Library
  2. OpenCSV

I've tried both, starting with OpenCSV. It threw an OutOfMemoryError when merely reading a file line by line, since I had a 600MB CSV file. Apparently there is a memory leak in the current version of the library.

I didn't have time to debug it, so I just switched to Java CSV, since the two have surprisingly similar APIs for basic operations, and it worked like a charm.

Java CSV lets you access columns either by index or by column name (if the file has a header).

UPDATE

Using Java CSV Lib you'll have to do something along these lines to access individual rows (quick'n'dirty, might not compile):

import java.io.IOException;

import com.csvreader.CsvReader;

class Parser {

    public static void main(String[] args) throws IOException {

        CsvReader reader = new CsvReader("input file name.csv",
                                         ',' /* delimiter */);
        try {
            // readRecord() advances to the next row and returns false at end of file
            while (reader.readRecord()) {

                // full row as raw text; you can use a regex to find
                // any rows you specifically want
                String row = reader.getRawRecord();

                // value of the first field
                String col = reader.get(0);

                // array of all fields in the current row
                String[] cols = reader.getValues();
            }
        } finally {
            // release the file handle even if reading fails
            reader.close();
        }
    }
}


The best way to handle this is to create a new entry for each row and store those rows in something like a Vector<Row>.

Split each line into a Row object with fields like Row.col1, Row.col2 ... (please choose better names =P).

Then you can iterate over the Vector and select only the rows relevant to you.
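A rough sketch of that idea, under a few assumptions of mine: the first cell of every line is the row name, the first line is the header, and cells contain no quoted commas. Row, parse and select are illustrative names, and the sketch uses an ArrayList where the text says Vector; either works here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RowFilterSketch {

    // One parsed line: the row name (first cell) plus the remaining cells.
    static class Row {
        final String name;
        final String[] cells;

        Row(String name, String[] cells) {
            this.name = name;
            this.cells = cells;
        }
    }

    // Turns the whole file content into Row objects, skipping the header line.
    static List<Row> parse(String content) {
        List<Row> rows = new ArrayList<>();
        String[] lines = content.split("\\r?\\n");
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].trim().isEmpty()) {
                continue; // ignore blank lines
            }
            // limit -1 keeps trailing empty cells
            String[] parts = lines[i].split(",", -1);
            rows.add(new Row(parts[0].trim(),
                             Arrays.copyOfRange(parts, 1, parts.length)));
        }
        return rows;
    }

    // Keeps only the rows whose name appears in the wanted set.
    static List<Row> select(List<Row> rows, Set<String> wanted) {
        List<Row> selected = new ArrayList<>();
        for (Row row : rows) {
            if (wanted.contains(row.name)) {
                selected.add(row);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        String content = ",col1,col2,col3\nrow1,a,b,c\nrow2,d,e,f\nrow3,g,h,i\n";
        Set<String> wanted = new HashSet<>(Arrays.asList("row1", "row3"));
        for (Row row : select(parse(content), wanted)) {
            System.out.println(row.name + " -> " + String.join("|", row.cells));
        }
    }
}
```

If the row names already live in an ArrayList, copying them into a HashSet first makes each contains check cheap, which matters once the file has many rows.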


First of all, you are better off reading each line separately rather than the whole file as a string. Since this is a text file, you can read line by line. Google for something like "Java read file by lines" and you'll find a lot of examples.

Now, in each row, you can break the line into components by spaces or by commas. Since you said it is a CSV file, I would expect commas, which also let you handle empty cells.

If you read the first line (the column titles) and store the position of each column in a Map (column title to array index), you can then find the appropriate value in each subsequent row. You could, for example, represent each row as a map from column name to cell value.

I'm not clear what you mean by "How can I make comparison that it only return me specific rows?", but it sounds like you want to filter rows and print them out. In that case, there is no need to store anything in memory except the current row: just iterate row by row and print each one that passes whatever checks you want to apply. If you do want to store all of the rows, use something like a Vector or a List, but be aware that you may run out of memory if the file is very large and many rows pass your check.
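As a sketch of the header-map idea, assuming a comma-separated header with no quoted fields (HeaderMapSketch, columnIndex and toRecord are names I made up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HeaderMapSketch {

    // Maps each column title in the header line to its position in the row.
    static Map<String, Integer> columnIndex(String headerLine) {
        Map<String, Integer> index = new LinkedHashMap<>();
        String[] titles = headerLine.split(",", -1);
        for (int i = 0; i < titles.length; i++) {
            index.put(titles[i].trim(), i);
        }
        return index;
    }

    // Represents one data line as a map from column title to cell value.
    static Map<String, String> toRecord(Map<String, Integer> index, String line) {
        // limit -1 keeps empty cells, so positions still line up with the header
        String[] cells = line.split(",", -1);
        Map<String, String> record = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : index.entrySet()) {
            int position = entry.getValue();
            // short rows get an empty string for the missing cells
            record.put(entry.getKey(), position < cells.length ? cells[position] : "");
        }
        return record;
    }

    public static void main(String[] args) {
        Map<String, Integer> index = columnIndex("name,col1,col2,col3");
        Map<String, String> record = toRecord(index, "row1,a,,c");
        System.out.println("col2 is empty: " + record.get("col2").isEmpty());
        System.out.println("col3 = " + record.get("col3"));
    }
}
```

The header map is built once and reused for every row, so looking up a cell by column title costs one map lookup plus one array access.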
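Here is a minimal sketch of that row-by-row filtering. My assumptions: the first line is a header, the row name is the text before the first comma, and the wanted names have been copied into a Set. StreamingFilterSketch and forEachMatching are illustrative names; for a real file you would build the BufferedReader with something like Files.newBufferedReader instead of the StringReader used in main.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

class StreamingFilterSketch {

    // Returns the text before the first comma, or the whole line if there is none.
    static String firstCell(String line) {
        int comma = line.indexOf(',');
        return (comma < 0 ? line : line.substring(0, comma)).trim();
    }

    // Reads one line at a time and passes matching lines to the sink.
    // Only the current line is held in memory.
    static void forEachMatching(BufferedReader in, Set<String> wanted,
                                Consumer<String> sink) {
        in.lines()
          .skip(1)                                        // skip the header line
          .filter(line -> wanted.contains(firstCell(line)))
          .forEach(sink);
    }

    public static void main(String[] args) throws IOException {
        String content = ",col1,col2,col3\nrow1,a,b,c\nrow2,d,e,f\nrow3,g,h,i\n";
        Set<String> wanted = new HashSet<>(Arrays.asList("row1", "row3"));
        try (BufferedReader in = new BufferedReader(new StringReader(content))) {
            forEachMatching(in, wanted, System.out::println);
        }
    }
}
```

Passing the sink as a Consumer keeps the choice open: System.out::println prints each match immediately, while list::add collects the matches if you decide you do want them in memory.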
