German Letters encoding problem
I get HTML from a webpage that is in German, and I have to insert that HTML into a database, but when I insert it the German letters do not appear correctly.
E.g. Bundesstraße appears as Bundesstra&szlig;e. I am using C# and a MySQL database.
It seems the special characters are encoded as HTML entities (http://www.w3schools.com/tags/ref_entities.asp) on the website. With UTF-8 this isn't necessary, but many sites still do it.
If you want the exact HTML as it is on the website, these encoded entities are correct.
To decode the entities, you can use System.Net.WebUtility.HtmlDecode(yourString).
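As a minimal sketch of that decoding step, assuming C# 9 or later (for top-level statements) and a made-up sample fragment standing in for the scraped page:

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical input: a fragment as it might arrive from the scraped page.
string scraped = "<p>Bundesstra&szlig;e</p>";

// HtmlDecode converts named and numeric entities back into characters.
// The markup itself (<p>...</p>) is left in place.
string decoded = WebUtility.HtmlDecode(scraped);

// Make sure the console can display the non-ASCII character.
Console.OutputEncoding = Encoding.UTF8;
Console.WriteLine(decoded);
```

Note that HtmlDecode also decodes entities such as &amp;lt; and &amp;amp;, so decoding a whole document can change its markup; whether that is acceptable depends on what you do with the stored HTML.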
What encoding are you using?
Try switching to UTF-8 and ensure your database supports it. It looks as though your string is HTML-encoded; this is fine for presentation, but you'll need the original format to store it in the database.
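A rough sketch of what a UTF-8 insert could look like, assuming the MySql.Data (Connector/NET) package, C# 9 or later, and a table whose column uses a UTF-8 character set. The server, credentials, database, table name `pages` and column name `html` are all placeholders, and the `CharSet` option should be checked against your connector version:

```csharp
using MySql.Data.MySqlClient;

// Placeholder connection values; CharSet asks the connector to talk UTF-8 to the server.
string connectionString =
    "Server=localhost;Database=scraper;Uid=user;Pwd=secret;CharSet=utf8mb4;";

// Hypothetical content to store.
string html = "<p>Bundesstraße</p>";

using (var connection = new MySqlConnection(connectionString))
{
    connection.Open();

    // A parameterized query lets the connector handle escaping and encoding.
    using (var command = new MySqlCommand(
        "INSERT INTO pages (html) VALUES (@html)", connection))
    {
        command.Parameters.AddWithValue("@html", html);
        command.ExecuteNonQuery();
    }
}
```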
In HTML, ß is encoded as &szlig;.
You say "i have to insert its html in database", and what you're currently getting is correct.