Load csv into oleDB and force all inferred datatypes to string
Im trying to load a csv file into a datatable using oledb.
This is no problem but unfortunately one of the fields which looks numeric has a string value in about 3% of the fields and so is not being populated.
because im converting the csv into xml i really don't care about inferring datatypes and simply need the data in a string as i can cast it later in a Linq2XMl phase.
I am hoping to be able to do this in the connection string.
I don't want to just copy the table, set it up with new columns with the datatype I want and then write the data into it because that would involve loading the csv file twice.
any ideas?
my current connection string is
Provider=Microsoft.Jet.OLEDB.4.0;Data Sour开发者_JAVA技巧ce=" + thefile.DirectoryName + ";Extended Properties='text;HDR=Yes;FMT=Delimited'";
Did some researchand the answer is use a schema.ini but generate it on the fly for your dataset.
http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx
contains the info required. to construct the schema:
public static void ConstructSchema(FileInfo theFile)
{
StringBuilder schema = new StringBuilder();
DataTable data = LoadCSV(theFile);
schema.AppendLine("[" + theFile.Name + "]");
schema.AppendLine("ColNameHeader=True");
for (int i = 0; i < data.Columns.Count; i++)
{
schema.AppendLine("col" + (i + 1).ToString() + "=" + data.Columns[i].ColumnName + " Text");
}
string schemaFileName = theFile.DirectoryName + @"\Schema.ini";
TextWriter tw = new StreamWriter(schemaFileName);
tw.WriteLine(schema.ToString());
tw.Close();
}
to load the csv as datatable
public static DataTable LoadCSV(FileInfo theFile)
{
string sqlString = "Select * FROM [" + theFile.Name + "];";
string conStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
+ theFile.DirectoryName + ";" + "Extended Properties='text;HDR=YES;'";
DataTable theCSV = new DataTable();
using (OleDbConnection conn = new OleDbConnection(conStr))
{
using (OleDbCommand comm = new OleDbCommand(sqlString, conn))
{
using (OleDbDataAdapter adapter = new OleDbDataAdapter(comm))
{
adapter.Fill(theCSV);
}
}
}
return theCSV;
}
to convert to xml
public static XElement GetXMLFromCSV(FileInfo theFile, string rootNodeName, string itemName)
{
XElement retVal;
DataTable data;
data = CrateCsvAndSchema(theFile);
DataSet ds = new DataSet(rootNodeName);
data.TableName = itemName;
ds.Tables.Add(data);
retVal = XElement.Parse(ds.GetXml());
return retVal;
}
For reading a CSV into a DataTable I recommend this CSV parser.
It's really easy to use. Here's how you can use it to fill a DataTable with data from a comma delimited, quote qualified CSV:
DataTable dt = null;
using (GenericParserAdapter gp = new GenericParser.GenericParserAdapter(yourCsvFullname)) {
dt = gp.GetDataTable();
}
There are a number of options you can set: the delimiter, the text qualifer character(s)whether the first line in the CSV show column headers (if true, each DataColumn in your DataTable will be named accordingly), etc.
There are a number of fast, flexible CSV parsers out there but for simple requirements this one can't be beat.
精彩评论