regex to parse csv
I'm looking for a regex that will parse a CSV file one line at a time. Basically, what string.readline() does, but it should allow line breaks if they are within double quotes.
Or is there an easier way to do this?
Using regex to parse CSV is fine for simple applications with well-controlled CSV data, but there are often many gotchas, such as escaped embedded quotes and commas or line breaks inside quoted strings. These make regex tricky and risky for this task.
I recommend a well-tested CSV module for your purpose.
--Edit:-- See this excellent article, Stop Rolling Your Own CSV Parser!
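To make the "commas in quoted strings" gotcha concrete, here is a minimal sketch in VB.NET (chosen only to match the code used elsewhere in this thread). The sample line and the module name are made up for illustration.

```vbnet
Imports System

Module SplitGotcha
    Sub Main()
        ' Hypothetical sample line with three logical fields:
        '   1   |   Smith, John   |   NY
        Dim line As String = "1,""Smith, John"",NY"

        ' A naive split treats the comma inside the quotes as a delimiter.
        Dim parts As String() = line.Split(","c)

        ' Prints 4, even though the line only holds 3 fields.
        Console.WriteLine(parts.Length)
    End Sub
End Module
```

Escaped quotes and line breaks inside quoted fields break a naive split in the same way, which is why a dedicated parser is usually the safer choice.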
The FileHelpers library is pretty good for this purpose.
http://www.filehelpers.net/
Rather than relying on error-prone regular expressions, oversimplified "split" logic, or third-party components, use the .NET Framework's built-in functionality:
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser("C:\MyFile.csv")
    Reader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
    Reader.SetDelimiters(",")
    ' Must be True so that quoted fields (including ones that contain
    ' commas or line breaks) are read as a single field.
    Reader.HasFieldsEnclosedInQuotes = True
    Dim currentRow As String()
    While Not Reader.EndOfData
        Try
            ' ReadFields returns one logical record, not one physical line.
            currentRow = Reader.ReadFields()
            For Each currentField As String In currentRow
                MsgBox(currentField)
            Next
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MsgBox("Line " & ex.LineNumber &
                   " is not valid and will be skipped.")
        End Try
    End While
End Using
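As a rough illustration of how this addresses the original question (line breaks inside double quotes), here is a minimal sketch that feeds the parser an in-memory string instead of a file. The sample data and module name are assumptions for illustration. It relies on the TextFieldParser constructor that accepts a TextReader, and on HasFieldsEnclosedInQuotes being True.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module EmbeddedNewlineDemo
    Sub Main()
        ' Hypothetical sample data: record 1 has a quoted comment field
        ' that contains a line break.
        Dim nl As String = Environment.NewLine
        Dim sample As String =
            "id,comment" & nl &
            "1,""first line" & nl & "second line""" & nl &
            "2,plain" & nl

        Using parser As New TextFieldParser(New StringReader(sample))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True

            While Not parser.EndOfData
                ' Each call returns the fields of one logical record.
                Dim fields As String() = parser.ReadFields()
                Console.WriteLine("{0} fields", fields.Length)
            End While
        End Using
    End Sub
End Module
```

If the parser behaves as documented, the quoted comment should come back as a single field even though it spans two physical lines, so each record should report two fields.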