Problem reading dBase DBF with non-English characters
I have a tool which reads dBase files and uploads the contents to SQL Server, part of a system to import shapefiles. It works but now we have a requirement to import files that include non-English characters (Norwegian in this case, could be other languages later) and they're being corrupted.
The dBase files are being read using an OleDbDataAdapter. Stepping through the code I can see that the text is wrong as it is read in. I'm assuming it's something to do with code pages or Unicode but I have no idea how to fix it.
A dBase Reader application tells me the DBFs are in code page 1252 - I don't know if this is correct. My upload tool runs on Win7 with English (UK) regional settings.
Examples:
ÅSGARD in DBF becomes +SGARD in VB.Net & SQL Server.
RINGHORNE ØST in DBF becomes RINGHORNE ÏST in VB.Net & SQL Server.
The code that reads the DBF:
dbfConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strPath & ";Extended Properties=dBASE IV"
Cnn.ConnectionString = dbfConnectionString
Cnn.Open()
strSQL = "SELECT * FROM [" & strDBF & "]"
DA = New OleDb.OleDbDataAdapter(strSQL, Cnn)
DS = New DataSet
DA.Fill(DS)
If DS.Tables(0).Rows.Count > 0 Then
    dtDBF = DS.Tables(0)
Else
    dtDBF = Nothing
End If
Data is read like: Name = dtDBF.Rows(index)("NAME_1")
Is there a way to tell OleDbDataAdapter what code page to use or a better way to read dBase files from VB.Net?
Try adding this to your DSN:
CollatingSequence=Norwegian-Danish
You might also be able to use:
CollatingSequence=International
Check whether the shapefile contains code page information. There are two places to look:
- Look in the language driver ID (LDID), which is found in the header of the shapefile’s DBF table (the byte at offset 29, counting from zero).
- Look for an associated separate file with the extension .cpg.
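Both checks can be sketched in VB.Net. This is a minimal sketch, not a tested tool: the module and function names (DbfCodePageCheck, ReadLdid, ReadCpg) are made up for illustration, and it assumes the .cpg file sits next to the DBF with the same base name.

```vbnet
Imports System.IO

Module DbfCodePageCheck
    ' Returns the language driver ID stored at byte offset 29 of the DBF header.
    ' A value of 0 means the file does not declare one.
    Function ReadLdid(dbfPath As String) As Byte
        Using fs As New FileStream(dbfPath, FileMode.Open, FileAccess.Read)
            Dim header(31) As Byte   ' the fixed part of the header is 32 bytes
            Dim bytesRead As Integer = fs.Read(header, 0, header.Length)
            If bytesRead < header.Length Then
                Throw New InvalidDataException("File is too short to be a DBF table.")
            End If
            Return header(29)
        End Using
    End Function

    ' Returns the trimmed contents of the .cpg sidecar file (assumed to share
    ' the DBF's base name), or Nothing if there is no such file.
    Function ReadCpg(dbfPath As String) As String
        Dim cpgPath As String = Path.ChangeExtension(dbfPath, ".cpg")
        If Not File.Exists(cpgPath) Then Return Nothing
        Return File.ReadAllText(cpgPath).Trim()
    End Function
End Module
```

Commonly documented LDID values include &H01 (code page 437), &H02 (code page 850) and &H03 (Windows-1252). Anything else needs to be looked up in an LDID table.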
If the code page is not specified in those locations, it defaults to the codepage on the PC that generated the shapefile. You will just have to know that :(
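When nothing declares the code page, the corrupted samples themselves can hint at what is happening. The sketch below encodes the expected text in one code page and decodes the same bytes in another, so candidate pairs can be compared with what actually shows up. The names MisdecodeCheck and Misdecode are hypothetical, and it assumes .NET Framework, where code pages 1252 and 850 are available without registering an extra encoding provider.

```vbnet
Imports System.Text

Module MisdecodeCheck
    ' Encodes the value with one code page and decodes the same bytes with another.
    Function Misdecode(value As String, writtenAs As Integer, readAs As Integer) As String
        Dim raw As Byte() = Encoding.GetEncoding(writtenAs).GetBytes(value)
        Return Encoding.GetEncoding(readAs).GetString(raw)
    End Function

    Sub Main()
        ' Ø is byte &HD8 in Windows-1252; code page 850 decodes that byte as Ï,
        ' so this prints "RINGHORNE ÏST", matching the corruption in the question.
        Console.WriteLine(Misdecode("RINGHORNE ØST", 1252, 850))

        ' Å is byte &HC5 in Windows-1252; code page 850 decodes it as U+253C,
        ' a box-drawing cross.
        Dim garbled As String = Misdecode("ÅSGARD", 1252, 850)
        Console.WriteLine("U+" & AscW(garbled.Chars(0)).ToString("X4"))
    End Sub
End Module
```

For the two samples in the question, the pair 1252 written / 850 read reproduces Ï exactly. The "+" in +SGARD would be consistent with the box-drawing cross later being flattened to a similar-looking ASCII character, but that last step is an assumption, not something this check proves. If the pattern holds, the data really is in code page 1252 and the driver is treating it as an OEM (DOS) code page.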
I've never used it, but maybe Shape2SQL takes care of this for you? Or shp2text? I believe the PostGIS shapefile loader handles code pages: maybe you could import into PostGIS and then export in another format??
Old question, but this may answer it for future readers...
You might try adding a property setting in your connection string:
Locale Identifier=1044
This property (and a list of values including this one) is documented for ADO in conjunction with Jet 4.0's OLE DB Provider, but I have no reason to believe it isn't also supported by ADO.Net. This value (1044) is Norwegian/Danish.
Untested, but something else to try.
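For reference, this is roughly how the property would slot into the connection string from the question. It is equally untested: the module and function names are made up, the folder path in the usage line is a placeholder, and whether the dBASE driver honours the setting is an open question.

```vbnet
Module LocaleIdentifierSketch
    ' Builds the question's connection string with the suggested property appended.
    ' folderPath is the folder that contains the DBF files.
    Function BuildDbfConnectionString(folderPath As String) As String
        Return "Provider=Microsoft.Jet.OLEDB.4.0;" &
               "Data Source=" & folderPath & ";" &
               "Extended Properties=dBASE IV;" &
               "Locale Identifier=1044"
    End Function
End Module
```

Usage would be along the lines of Cnn.ConnectionString = BuildDbfConnectionString(strPath), with the rest of the reading code unchanged.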