开发者

remove unwanted chr(13) from csv string with classic asp (vbscript)

I want to create a classic asp (vbscript) function that replaces all 'returns' that occur between double quotes.

开发者_如何学CThe input string is 'csv' like:

ID;Text;Number
1;some text;20
2;"some text with unwanted return
";30
3;some text again;40

I want to split the string on chr(13) (returns) to create single rows in an array. It works well, except for the unwanted chr(13) that is contained in the text of id 2.

I hope someone could help.


Fundamentally, this is going to be difficult to do as you won't be able to tell whether the carriage return is a valid one or not. Clearly the ones after 20 and 30 are valid.

An approach I would would be to scan through each line in the file and count the commas that occur. If it's less than 3, then append the next line and use the concatenated string. (This of course assumes your CSV structure is consistent and fixed).

What I would really be asking here is why is the CSV like this in the first place? The routine that populates this should really be the one stripping the the CRs out.

Think of a CSV file like a very crude database or spreadsheet. When cosidering the above file, it is clear that the 'Database'/'Spreadsheet' is corrupt.

If the program that generates this is correupting it, then what extent should the reading application goto to correct these defects? I'm not sure that Excel or SQL Server (for example) would go to great lengths to correct a corrupt data source.


Your text file is just like a CSV file but with semicolons not commas. Use ADO to grab the data and it will handle the line breaks in fields.

Specifically (In ASP VBScript):

On Error Resume Next
Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001

Set objConnection = Server.CreateObject("ADODB.Connection")
Set objRecordSet = Server.CreateObject("ADODB.Recordset")

strPathtoTextFile = server.mappath(".")   'Path to your text file

objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
         "Data Source=" & strPathtoTextFile & ";" & _
         "Extended Properties=""text;HDR=YES;FMT=Delimited"""

objRecordset.Open "SELECT * FROM test.txt", _
         objConnection, adOpenStatic, adLockOptimistic, adCmdText

Do Until objRecordset.EOF
    Response.Write "ID: " & objRecordset.Fields.Item("ID") & "<br>"
    Response.Write "Text: " & objRecordset.Fields.Item("Text") & "<br>"
    Response.Write "Number: " & objRecordset.Fields.Item("Number") & "<br>"
    objRecordset.MoveNext
Loop

Code sample is modified from Microsofts' Much ADO About Text Files.

This script assumes your data text file is in the same directory as it (the asp file). It also needs a schema.ini file in the same directory as your data text file with the data:

[test.txt]
Format=Delimited(;)

Change text.txt in both code samples above to the name of your text file.


If the unwanted CRLF always occurs inside a text field (inside double quotes), it would not be very difficult to use a regular expression to remove these. Vbscript has a regex engine to its disposal: http://authors.aspalliance.com/brettb/VBScriptRegularExpressions.asp It all depends ofcourse on how familiar you are with Regular Expressions. I couldn't think of the proper syntax off the top of my head, but this is probably quite easy to figure out.


The solution is pretty easy:

str = "Some text..." & chr(13)
str = REPLACE(str,VbCrlf,"")

The secret is use VbCrlf. For me I use a simple function for solve the problem and add this in my framework.

FUNCTION performStringTreatmentRemoveNewLineChar(byval str)

    IF isNull(str) THEN
        str = ""
    END IF
    str = REPLACE(str,VbCrlf,"")
    performStringTreatmentRemoveNewLineChar = TRIM(str)

END FUNCTION

Of course this will remove all new lines character from this string. Use carrefully.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜