Parse an advanced CSV file

I have to load the following CSV file:

head1, head2, head3, head4; head5
34 23; 2; "abc";"abc \"sdjh";8
34 23; 2; "abc";"abc 
sdj\;h
jshd";8
34 23; 2; "abc";"abc";8

The function must handle escape sequences such as \" \; \n and \r, as well as literal line breaks inside quoted strings. Is there a good library that can do this?


I've had good results using CSV Reader for .Net: http://www.codeproject.com/KB/database/CsvReader.aspx.


That's not a valid CSV file...

The header row would be interpreted as

"head1"," head2"," head3"," head4; head5"

Every other row only has a single column in it.
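To make that concrete, here is a minimal sketch of what a comma-delimited reader sees. It uses a plain `string.Split(',')` and ignores quoting entirely, so it is only an illustration of the column counts, not how any particular library works. The class name `CommaSplitDemo` is made up for this example.

```csharp
using System;

public static class CommaSplitDemo
{
    // Naive comma split (no quote handling), used only to count columns.
    public static string[] SplitOnComma(string line) => line.Split(',');

    public static void Main()
    {
        string header = "head1, head2, head3, head4; head5";
        string row = "34 23; 2; \"abc\";\"abc\";8";

        // The header breaks into four pieces; the last one is " head4; head5".
        Console.WriteLine(SplitOnComma(header).Length);

        // The data row contains no comma at all, so it stays one column.
        Console.WriteLine(SplitOnComma(row).Length);
    }
}
```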

I don't think any library will be able to handle this out of the box. It looks like the header row uses more than one delimiter, and the other rows might too. If you also said what the actual columns are supposed to be, it would be easier to help.
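If no library fits, a small hand-written state machine may be enough. The sketch below is only one possible reading of the format, and it rests on several assumptions that the question does not confirm: `;` is the only delimiter in data rows, a backslash escapes the next character (`\n` and `\r` become line breaks, anything else is taken literally), quotes are never doubled, and leading whitespace outside quotes is dropped. Trailing whitespace is kept, and the mixed-delimiter header line is not handled. The name `BackslashCsv` is made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BackslashCsv
{
    // Splits text into rows of fields. Assumed rules: backslash escapes the
    // next character, double quotes group text (including line breaks).
    public static List<List<string>> Parse(string text, char delimiter = ';')
    {
        var rows = new List<List<string>>();
        var row = new List<string>();
        var field = new StringBuilder();
        bool inQuotes = false;  // currently between an opening and closing quote
        bool sawQuote = false;  // the current field contained a quoted part

        void EndField()
        {
            row.Add(field.ToString());
            field.Clear();
            sawQuote = false;
        }

        void EndRow()
        {
            // Ignore completely blank lines.
            if (row.Count == 0 && field.Length == 0 && !sawQuote) return;
            EndField();
            rows.Add(row);
            row = new List<string>();
        }

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == '\\' && i + 1 < text.Length)
            {
                // Escape: \n and \r become line breaks, others are literal.
                char next = text[++i];
                field.Append(next == 'n' ? '\n' : next == 'r' ? '\r' : next);
            }
            else if (c == '"')
            {
                inQuotes = !inQuotes;
                sawQuote = true;
            }
            else if (inQuotes)
            {
                field.Append(c);  // delimiters and line breaks are data here
            }
            else if (c == delimiter)
            {
                EndField();
            }
            else if (c == '\n' || c == '\r')
            {
                // Treat \r\n as a single line ending.
                if (c == '\r' && i + 1 < text.Length && text[i + 1] == '\n') i++;
                EndRow();
            }
            else if ((c == ' ' || c == '\t') && field.Length == 0 && !sawQuote)
            {
                // Skip leading whitespace before an unquoted or quoted value.
            }
            else
            {
                field.Append(c);
            }
        }
        EndRow();  // flush the last row if the text has no trailing line break
        return rows;
    }

    public static void Main()
    {
        string sample =
            "34 23; 2; \"abc\";\"abc \\\"sdjh\";8\n" +
            "34 23; 2; \"abc\";\"abc \nsdj\\;h\njshd\";8\n";

        foreach (var r in Parse(sample))
            Console.WriteLine(r.Count + " fields, 4th = [" + r[3] + "]");
    }
}
```

Under these assumptions each of the two sample rows comes back with five fields, and the fourth field of the second row keeps its embedded line breaks and the escaped `;`. It reads the whole input as one string, so it is meant for small files.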

You could give CsvHelper (a library I maintain) a try. It is pretty flexible. You could change the configuration for the headers and rows and make them different. You can set what you want the delimiter and quoted field to be. It also handles line endings of \r, \n, and \r\n even if every line is using a different line ending.


I couldn't get anything to pass all my tests for CSV parsing, so I ended up writing something simple to do it: AnotherCsvParser.

It does everything I need... but should be easy to fork and extend to your needs too.

Given:

    public class ABCD
    {
        public string A;
        public string B;
        public string C;
        public string D;
    }

It assumes the columns are in the order the fields are defined (but it would be easy to extend it to read an attribute or something).

This works:

    var output = NigelThorne.CSVParser.ReadCSVAs<ABCD>(
        "a,\"b\",c,d\n1,2,3,4\n\"something, with a comma\",\"something \\\"in\\\" quotes\",\" a \\\\ slash \",\n,,\"\n\",");

Such that:

    Assert.AreEqual(4, output.ToArray().Length);
    var row1 = output.ToArray()[0];
    Assert.AreEqual("a", row1.A);
    Assert.AreEqual("b", row1.B);
    Assert.AreEqual("c", row1.C);
    Assert.AreEqual("d", row1.D);

Note: It's probably not very fast with lots of data either. Again, that's not a problem for me.
