Problem reading dBase DBF with non-English characters

I have a tool which reads dBase files and uploads the contents to SQL Server, part of a system to import shapefiles. It works but now we have a requirement to import files that include non-English characters (Norwegian in this case, could be other languages later) and they're being corrupted.

The dBase files are being read using an OleDbDataAdapter. Stepping through the code I can see that the text is wrong as it is read in. I'm assuming it's something to do with code pages or Unicode but I have no idea how to fix it.

A dBase Reader application tells me the DBFs are in code page 1252 - I don't know if this is correct. My upload tool runs on Win7 with English (UK) regional settings.

Examples:

ÅSGARD in DBF becomes +SGARD in VB.Net & SQL Server.

RINGHORNE ØST in DBF becomes RINGHORNE ÏST in VB.Net & SQL Server.

The code that reads the DBF:

dbfConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strPath & ";Extended Properties=dBASE IV"
Cnn.ConnectionString = dbfConnectionString
Cnn.Open()

strSQL = "SELECT * FROM [" & strDBF & "]"
DA = New OleDb.OleDbDataAdapter(strSQL, Cnn)
DS = New DataSet
DA.Fill(DS)

If DS.Tables(0).Rows.Count > 0 Then
  dtDBF = DS.Tables(0)
Else
  dtDBF = Nothing
End If

Data is read like: Name = dtDBF.Rows(index)("NAME_1")

Is there a way to tell OleDbDataAdapter what code page to use or a better way to read dBase files from VB.Net?


Try adding this to your DSN:

CollatingSequence=Norwegian-Danish

You might also be able to use:

CollatingSequence=International


Check whether the shapefile contains code page information. There are two places to look:

  • Look in the language driver ID (LDID), a single byte in the header of the shapefile’s DBF table (byte offset 29, counting from zero).
  • Look for an associated separate file with extension .cpg.
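
Both checks can be done without opening the table through Jet. Below is a minimal Python sketch, not part of the original tool: the function names `read_ldid` and `read_cpg` are made up for illustration, and it assumes the `.cpg` file sits next to the `.dbf` with the same base name and a lower-case extension.

```python
from pathlib import Path

# Zero-based offset of the language driver ID (LDID) byte in the DBF header.
LDID_OFFSET = 29


def read_ldid(dbf_path):
    """Return the LDID byte from a DBF header; 0 usually means it was not set."""
    with open(dbf_path, "rb") as f:
        header = f.read(32)  # the fixed part of the DBF header is 32 bytes
    if len(header) < 32:
        raise ValueError("file is too short to be a DBF table")
    return header[LDID_OFFSET]


def read_cpg(dbf_path):
    """Return the text of the sibling .cpg file, or None if there is none."""
    cpg_path = Path(dbf_path).with_suffix(".cpg")
    if not cpg_path.exists():
        return None
    # A .cpg file is normally a single short line naming the code page.
    return cpg_path.read_text(encoding="ascii", errors="replace").strip()
```

Calling `hex(read_ldid(path))` and `read_cpg(path)` on one of the problem files gives the two values to compare against what the dBase Reader application reports. The LDID is only a lookup key, so it still has to be translated to a code page using the table published for the DBF flavour in use.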

If the code page is not specified in those locations, it defaults to the codepage on the PC that generated the shapefile. You will just have to know that :(
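
When neither location records the code page, a known good/bad pair of values can narrow it down. The sketch below is an assumption-laden diagnostic rather than a fix: `explain_corruption` is a hypothetical helper, and the candidate list (Windows-1252 plus three DOS code pages) is just a guess at what is plausible for Western European data. It tries each "stored as" / "decoded as" combination and reports the ones that reproduce the corruption.

```python
def explain_corruption(original, corrupted,
                       candidates=("cp1252", "cp437", "cp850", "cp865")):
    """Return (stored_as, decoded_as) pairs that turn `original` into `corrupted`.

    `candidates` is an illustrative guess at likely code pages, not a
    definitive list.
    """
    matches = []
    for stored_as in candidates:
        try:
            raw = original.encode(stored_as)
        except UnicodeEncodeError:
            continue  # this code page cannot represent the original text
        for decoded_as in candidates:
            if decoded_as == stored_as:
                continue
            try:
                decoded = raw.decode(decoded_as)
            except UnicodeDecodeError:
                continue  # byte is undefined in this code page
            if decoded == corrupted:
                matches.append((stored_as, decoded_as))
    return matches


# Second example from the question.
print(explain_corruption("RINGHORNE ØST", "RINGHORNE ÏST"))
```

For the second example in the question, the only pair among these candidates that matches is text stored as cp1252 and decoded as cp850. The first example fits the same pair only indirectly: the cp1252 byte for Å decodes under cp850 to the box-drawing character ┼, so the `+` would have to come from a later substitution of that character, which is an assumption.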

I've never used it, but maybe Shape2SQL takes care of this for you? Or shp2text? I believe the PostGIS shapefile loader handles code pages: maybe you could import into PostGIS and then export in another format??


Old question, but this may answer it for future readers...

You might try adding a property setting in your connection string:

Locale Identifier=1044

This property (and a list of values including this one) is documented for ADO in conjunction with Jet 4.0's OLE DB Provider, but I have no reason to believe it isn't also supported by ADO.Net. This value (1044) is Norwegian/Danish.

Untested, but something else to try.
