How to prevent specific comma delimted values being parsed (C#) [duplicate]
Possible Duplicate:
Dealing with commas in a CSV file
I am currently parsing values from a CSV file and adding them to a datatable.
The csv file contains 5 columns and am parsing each row before adding it to the datatable.
After parsing the csv, the datatable could be visualized as the following:
| Town/City | Cost |
| Birmingham | 400 |
| Manchester | 500 |
For this data, there are no problems. However, I have some values that look like the following:
| Town/City | Cost |
| London, West | 800 |
As there is a comma bet开发者_运维知识库ween a value for the one column, it is obviously parsing this as a seperate column.
The data cannot be changed, therefore I need a way to parse this as a single column rather than two.
This is my code so far that parses rows which have 5 columns. I have commented the bit where I guess the new code will need to go.
//parse csv file and return as data table
public System.Data.DataTable GetCsvData()
{
string strLine;
char[] charArray = new char[] { ',' };
List<string> strList = new List<string>();
System.Data.DataTable dt = new System.Data.DataTable("csvData");
System.IO.FileStream fileStream = null;
System.IO.StreamReader streamReader = null;
if (!string.IsNullOrEmpty(csvFilePath))
{
fileStream = new System.IO.FileStream(csvFilePath, System.IO.FileMode.Open);
streamReader = new System.IO.StreamReader(fileStream);
strLine = streamReader.ReadLine();
strList = strLine.Split(charArray).ToList();
//only add first 5 columns
for (int i = 0; i <= 4; i++)
dt.Columns.Add(strList[i].Trim());
strLine = streamReader.ReadLine();
while (strLine != null)
{
strList = strLine.Split(charArray).ToList();
System.Data.DataRow dataRow = dt.NewRow();
/*THIS CODE PARSES THE ROW'S 5 COLUMNS AND NEEDS TO PARSE COMMA
SEPERATED VALUES AS A SINGLE VALUE*/
for (int i = 0; i <= 4; i++)
dataRow[i] = strList[i].Trim();
dt.Rows.Add(dataRow);
strLine = streamReader.ReadLine();
}
streamReader.Close();
return dt;
}
return null;
}
Any help with this would be greatly appreciated as I am struggling to find answers on google.
I propose checking the array after the split. If you find it has N + 1 columns (where you expect N), merge the two City columns and shift the others down (strList[i] = strList[i+1]). Otherwise process as normal.
Of course this only works if you have only the one column that has a potential comma.
In addition to just checking the length of the split array as @Bahri suggests, if your data is predictable enough (as in your example), you could check column content.
If cost in your example is always a number, you could check to see if it contains only digits (or use a Regex for more complex matching). If not, then collapse the previous two columns.
精彩评论