
How to prevent specific comma-delimited values being parsed (C#) [duplicate]

This question already has answers here: Closed 11 years ago.

Possible Duplicate:

Dealing with commas in a CSV file

I am currently parsing values from a CSV file and adding them to a datatable.

The CSV file contains 5 columns, and I am parsing each row before adding it to the datatable.

After parsing the csv, the datatable could be visualized as the following:

|  Town/City  | Cost |
| Birmingham  | 400  |
| Manchester  | 500  |

For this data, there are no problems. However, I have some values that look like the following:

|  Town/City    | Cost |
|  London, West | 800  |

As there is a comma within the value for a single column, that value is obviously being parsed as two separate columns.

The data cannot be changed, therefore I need a way to parse this as a single column rather than two.

This is my code so far that parses rows which have 5 columns. I have commented the bit where I guess the new code will need to go.

    //parse csv file and return as data table
    public System.Data.DataTable GetCsvData()
    {
        string strLine;
        char[] charArray = new char[] { ',' };

        List<string> strList = new List<string>();

        System.Data.DataTable dt = new System.Data.DataTable("csvData");
        System.IO.FileStream fileStream = null;
        System.IO.StreamReader streamReader = null;

        if (!string.IsNullOrEmpty(csvFilePath))
        {
            fileStream = new System.IO.FileStream(csvFilePath, System.IO.FileMode.Open);
            streamReader = new System.IO.StreamReader(fileStream);

            strLine = streamReader.ReadLine();

            strList = strLine.Split(charArray).ToList();

            //only add first 5 columns
            for (int i = 0; i <= 4; i++)
                dt.Columns.Add(strList[i].Trim());

            strLine = streamReader.ReadLine();

            while (strLine != null)
            {
                strList = strLine.Split(charArray).ToList();

                System.Data.DataRow dataRow = dt.NewRow();

                /*THIS CODE PARSES THE ROW'S 5 COLUMNS AND NEEDS TO PARSE COMMA
                SEPARATED VALUES AS A SINGLE VALUE*/
                for (int i = 0; i <= 4; i++)
                    dataRow[i] = strList[i].Trim();

                dt.Rows.Add(dataRow);

                strLine = streamReader.ReadLine();
            }

            streamReader.Close();
            return dt;
        }

        return null;
    }

Any help with this would be greatly appreciated, as I am struggling to find answers on Google.


I propose checking the array after the split. If you find it has N + 1 columns (where you expect N), merge the two City columns and shift the others down (strList[i] = strList[i+1]). Otherwise process as normal.

Of course, this only works if just one column can contain a comma, and you know which one it is.
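
The length check described above could be sketched roughly as follows. This is a minimal illustration, not a drop-in fix: the class name `CsvRowFixer`, the method `MergeExtraField`, and its parameters `expectedCount` and `mergeIndex` are hypothetical names that do not appear in the question's code, and the sketch assumes at most one stray comma per row.

```csharp
using System;
using System.Collections.Generic;

public static class CsvRowFixer
{
    // Hypothetical helper: splits a line on commas and, if the result has exactly
    // one field more than expected, joins the fields at mergeIndex and
    // mergeIndex + 1 back into a single value.
    public static List<string> MergeExtraField(string line, int expectedCount, int mergeIndex)
    {
        var fields = new List<string>(line.Split(','));

        if (fields.Count == expectedCount + 1)
        {
            // Restore the comma that Split removed, then drop the extra field so
            // the remaining columns shift down by one position.
            fields[mergeIndex] = fields[mergeIndex].Trim() + ", " + fields[mergeIndex + 1].Trim();
            fields.RemoveAt(mergeIndex + 1);
        }

        // Trim surrounding whitespace, as the original loop does.
        for (int i = 0; i < fields.Count; i++)
            fields[i] = fields[i].Trim();

        return fields;
    }
}
```

In the question's loop, `strList = CsvRowFixer.MergeExtraField(strLine, 5, 0);` would replace the `Split` call, assuming Town/City is the first column. Note that the merged value is rebuilt with ", " between the two parts, so the original spacing around the comma is normalized rather than preserved.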


In addition to just checking the length of the split array as @Bahri suggests, if your data is predictable enough (as in your example), you could check column content.

If cost in your example is always a number, you could check to see if it contains only digits (or use a Regex for more complex matching). If not, then collapse the previous two columns.
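
A content check along these lines might look like the sketch below. The names `CsvContentCheck`, `IsAllDigits`, `CollapseIfCostIsNotNumeric`, and `costIndex` are made up for illustration, and the sketch assumes the cost column holds only digits, as in the example data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvContentCheck
{
    // True when the trimmed value is non-empty and every character is a digit.
    // char.IsDigit also accepts non-ASCII Unicode digits; a Regex such as
    // ^[0-9]+$ would be stricter.
    public static bool IsAllDigits(string value)
    {
        string trimmed = value.Trim();
        return trimmed.Length > 0 && trimmed.All(char.IsDigit);
    }

    // Hypothetical helper: if the field where the cost should be is not numeric,
    // assume it is the tail of the previous field and collapse the two.
    public static List<string> CollapseIfCostIsNotNumeric(List<string> fields, int costIndex)
    {
        var result = fields.Select(f => f.Trim()).ToList();

        if (costIndex > 0 && costIndex < result.Count && !IsAllDigits(result[costIndex]))
        {
            result[costIndex - 1] = result[costIndex - 1] + ", " + result[costIndex];
            result.RemoveAt(costIndex);
        }

        return result;
    }
}
```

For the row "London, West,800" split on commas, calling `CollapseIfCostIsNotNumeric` with `costIndex` set to 1 finds "West" where a number is expected and merges it back into the town name. Unlike the length check, this works even when rows may legitimately vary in length, but it depends on the cost column never containing non-digit characters.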
