escaping tricky string to CSV format
I have to create a CSV file from webservice output and the CSV file uses quoted strings with comma separator. I cannot change the format...
So if I have a string
it becomes a "string"
...
If the value has quotes already they are replaced with double quotes.
For example a str"ing
becomes "str""ing"
...
However, lately my import has been failing because of the following
- original input string is:
"","word1,word2,..."
- every single quote is replaced by double resulting in:
"""",""word1,word2,...""
- then its prefixed and suffixed with quote before written to CVS file:
""""",""word1,word2,..."""
As you can see the final result is this:
""""",""word1,word2,..."""
which breaks my import (is sees it as another field)...
I think the issue is appereance of ","
in the original input string.
Is there a CVS escape sequence for this scenario?
Update
The reason why above breaks is due to BCP mapping file (BCP utility is used to load CSV file into SQL db) which has terminator defined as ","
. So inste开发者_如何学编程ad of seeing 1 field it sees 2...But I cannot change the mapping file...
I use this code and it has always worked:
/// <summary>
/// Turn a string into a CSV cell output
/// </summary>
/// <param name="str">String to output</param>
/// <returns>The CSV cell formatted string</returns>
public static string StringToCSVCell(string str)
{
bool mustQuote = (str.Contains(",") || str.Contains("\"") || str.Contains("\r") || str.Contains("\n"));
if (mustQuote)
{
StringBuilder sb = new StringBuilder();
sb.Append("\"");
foreach (char nextChar in str)
{
sb.Append(nextChar);
if (nextChar == '"')
sb.Append("\"");
}
sb.Append("\"");
return sb.ToString();
}
return str;
}
Based on Ed Bayiates' answer:
/// <summary>
/// Turn a string into a CSV cell output
/// </summary>
/// <param name="value">String to output</param>
/// <returns>The CSV cell formatted string</returns>
private string ConvertToCsvCell(string value)
{
var mustQuote = value.Any(x => x == ',' || x == '\"' || x == '\r' || x == '\n');
if (!mustQuote)
{
return value;
}
value = value.Replace("\"", "\"\"");
return string.Format("\"{0}\"", value);
}
My penny thought:
String[] lines = new String[] { "\"\",\"word\",word,word2,1,34,5,2,\"details\"" };
for (int j = 0; j < lines.Length; j++)
{
String[] fields=lines[j].Split(',');
for (int i =0; i<fields.Length; i++)
{
if (fields[i].StartsWith("\"") && fields[i].EndsWith("\""))
{
char[] tmp = new char[fields[i].Length-2];
fields[i].CopyTo(1,tmp,0,fields[i].Length-2);
fields[i] =tmp.ToString();
fields[i] = "\""+fields[i].Replace("\"","\"\"")+"\"";
}
else
fields[i] = fields[i].Replace("\"","\"\"");
}
lines[j]=String.Join(",",fields);
}
Based on contribution of "Ed Bayiates" here's an helpful class to buid csv document:
/// <summary>
/// helpful class to build csv document
/// </summary>
public class CsvBuilder
{
/// <summary>
/// create the csv builder
/// </summary>
public CsvBuilder(char csvSeparator)
{
m_csvSeparator = csvSeparator;
}
/// <summary>
/// append a cell
/// </summary>
public void appendCell(string strCellValue)
{
if (m_nCurrentColumnIndex > 0) m_strBuilder.Append(m_csvSeparator);
bool mustQuote = (strCellValue.Contains(m_csvSeparator)
|| strCellValue.Contains('\"')
|| strCellValue.Contains('\r')
|| strCellValue.Contains('\n'));
if (mustQuote)
{
m_strBuilder.Append('\"');
foreach (char nextChar in strCellValue)
{
m_strBuilder.Append(nextChar);
if (nextChar == '"') m_strBuilder.Append('\"');
}
m_strBuilder.Append('\"');
}
else
{
m_strBuilder.Append(strCellValue);
}
m_nCurrentColumnIndex++;
}
/// <summary>
/// end of line, new line
/// </summary>
public void appendNewLine()
{
m_strBuilder.Append(Environment.NewLine);
m_nCurrentColumnIndex = 0;
}
/// <summary>
/// Create the CSV file
/// </summary>
/// <param name="path"></param>
public void save(string path )
{
File.WriteAllText(path, ToString());
}
public override string ToString()
{
return m_strBuilder.ToString();
}
private StringBuilder m_strBuilder = new StringBuilder();
private char m_csvSeparator;
private int m_nCurrentColumnIndex = 0;
}
How to use it:
void exportAsCsv( string strFileName )
{
CsvBuilder csvStringBuilder = new CsvBuilder(';');
csvStringBuilder.appendCell("#Header col 1 : Name");
csvStringBuilder.appendCell("col 2 : Value");
csvStringBuilder.appendNewLine();
foreach (Data data in m_dataSet)
{
csvStringBuilder.appendCell(data.getName());
csvStringBuilder.appendCell(data.getValue());
csvStringBuilder.appendNewLine();
}
csvStringBuilder.save(strFileName);
}
the first step in parsing this is removing the extra added " 's around your string. Once you do this, you should be able to deal with the embedded " as well as the ,'s.
After much deliberation, it was decided that import utility format was needed to be fixed. The escaping of the string was correct (as users indicated) but the format file that import utility used was incorrect and was causing it to break import.
Thanks all and special thanks to @dbt (up vote)
精彩评论