Advanced .csv parsing: survey answer file?
First, I'd like to point out that I already know how to parse .csv files on commas, tabs, and so on. I still have a problem, however, and I'm a little stuck.
What I'm trying to do is build an application that reads in a .csv survey answer file (preferably all extension types eventually, but let's start with one). These survey answer files are pre-generated by other websites (i.e., the user downloads their survey answers from a survey site and then uses my application). The purpose of the application is to perform statistical analysis on the data.
So the problem I'm having is figuring out how to read in the file and separate questions from answers from irrelevant text. I need a reusable way of doing this for multiple answer files with different question types, etc.
I know an easier method would be to have the user create the survey with my application and then analyze it, so that I control the formatting, but at the moment this is not an option.
NOTE: I plan on reading all the variables into the system and then allowing the user to select variables from a list and execute analysis algorithms on them.
Again, I know there are advanced CSV readers out there; I'm just looking for ideas on how to approach my problem.
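To make the plan in the note concrete, here is a minimal sketch in Python (chosen only for illustration; the thread itself names no language). It assumes the first row holds the question text and every later row is one respondent's answers, which real survey exports may not guarantee. The function name `load_variables` and the sample data are hypothetical.

```python
import csv
import io

def load_variables(csv_text):
    """Return {column name: list of answers}.

    Assumption: row 1 holds the question/column names and the names are
    unique; duplicate names would overwrite each other in this sketch.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    variables = {name: [] for name in header}
    for row in reader:
        # zip() silently ignores extra or missing trailing values.
        for name, value in zip(header, row):
            variables[name].append(value)
    return variables

# Hypothetical two-question survey export.
sample = "Q1: Age,Q2: Likes product\n34,Yes\n27,No\n"
variables = load_variables(sample)

print(list(variables))       # the names a user could pick from
print(variables["Q1: Age"])  # the answers for one selected variable
```

Storing the data by column rather than by row makes "pick a variable from a list, then run an analysis on it" a simple dictionary lookup.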
Use Microsoft.VisualBasic.FileIO.TextFieldParser.
It is specifically designed to parse delimited text files such as .csv, and it handles commas inside quoted fields too.
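To show why a dedicated parser matters, here is a small sketch. It does not use TextFieldParser; it swaps in Python's standard `csv` module as a stand-in, and the sample line is made up. The point is only that a quote-aware parser keeps a comma inside a quoted answer, while a naive split breaks the field apart.

```python
import csv
import io

# Hypothetical survey row: the free-text answer contains a comma.
line = '1,"Yes, but only on weekends",5\n'

# Naive approach: splitting on every comma breaks the quoted answer in two.
naive = line.strip().split(",")

# Quote-aware approach: the csv module respects the surrounding quotes.
fields = next(csv.reader(io.StringIO(line)))

print(len(naive))   # 4 pieces, one too many
print(fields)       # 3 fields, with the comma preserved inside the answer
```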
For parsing CSV, you could use the regular expression I describe in my solution to this post. It would be evaluated line by line.
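The regular expression from the linked post is not reproduced in this thread, so the following is a different, illustrative pattern and not that author's solution. It is a sketch in Python that assumes one record per line (no line breaks inside quoted fields) and well-formed quoting, with `""` used as an escaped quote. The name `split_csv_line` is hypothetical.

```python
import re

# One field per match: either a quoted field (group 1) or an unquoted one (group 2),
# each preceded by the start of the line or a comma.
FIELD = re.compile(r'(?:^|,)(?:"((?:[^"]|"")*)"|([^,]*))')

def split_csv_line(line):
    """Split a single CSV line into fields using the pattern above."""
    fields = []
    for match in FIELD.finditer(line):
        quoted, plain = match.group(1), match.group(2)
        if quoted is not None:
            # Collapse doubled quotes back into a single quote character.
            fields.append(quoted.replace('""', '"'))
        else:
            fields.append(plain)
    return fields

print(split_csv_line('1,"Yes, ok",5'))
print(split_csv_line('a,"He said ""hi""",b'))
```

A regex like this is easy to drop in, but it does not cope with malformed quoting or multi-line fields, which is where a real CSV parser earns its keep.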
Does the first row of your file (CSV, where the delimiter is a comma, or TSV, where the delimiter is a tab) hold the column names? Do all rows have the same number of values (with missing or null values, where necessary, designated by consecutive delimiters)?
If the answers to both questions are in the affirmative, one option is to use ADO with the JET 4.0 driver to read each file as a relational data source.
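Before treating a file as a table, the second question can be checked mechanically. The sketch below is in Python and is not ADO or Jet code; `check_rectangular` is a hypothetical helper. It only verifies that every row has as many values as the first row; whether that first row really holds column names still has to be judged separately.

```python
import csv
import io

def check_rectangular(text, delimiter=","):
    """Return (ok, bad_lines): ok is True when every row has as many
    values as the first row; bad_lines lists the 1-based row numbers
    that do not. Row numbers equal file line numbers only if no field
    spans several lines."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return False, []
    expected = len(rows[0])
    bad_lines = [number for number, row in enumerate(rows[1:], start=2)
                 if len(row) != expected]
    return not bad_lines, bad_lines

# Missing values written as consecutive delimiters still count as values.
good = "q1,q2,q3\n1,,3\n4,5,6\n"
ragged = "q1,q2,q3\n1,2\n4,5,6\n"

print(check_rectangular(good))
print(check_rectangular(ragged))
```

Passing `delimiter="\t"` applies the same check to a tab-separated file.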
There are plenty of samples that demonstrate the technique. Start here.