IWebBrowser: How to specify the encoding when loading html from a stream?
Using the concepts from the sample code provided by Microsoft for loading HTML content into an IWebBrowser from an IStream using the web browser's IPers开发者_运维百科istStreamInit interface:
pseudocode:
void LoadWebBrowserFromStream(IWebBrowser webBrowser, IStream stream)
{
IPersistStreamInit persist = webBrowser.Document as IPersistStreamInit;
persist.Load(stream);
}
How can one specify the encoding of the html inside the IStream? The IStream will contain a series of bytes, but the problem is what do those bytes represent? They could, for example, contain bytes where:
- each byte represents a character from the current Windows code-page (e.g. 1252)
- each byte could represent a character from the ISO-8859-1 character set
- the bytes could represent UTF-8 encoded characters
- every 2 bytes could represent a character, using UTF-16 encoding
In my particular case, i am providing the IWebBrowser an IStream that contains a series of double-bytes characters (UTF-16), but the browser (incorrectly) believes that UTF-8 encoding is in effect. This results in garbled characters.
Workaround solution
While the question asks how to specify the encoding, in my particular case, with only UTF-16 encoding, there's a simple workaround. Adding the 0xFEFF Byte Order Mark (BOM) indicates that the text is UTF-16 unicode. ie then uses the proper encoding and shows the text properly.
Of course that wouldn't work if the text were encoded, for example with:
- UCS-2
- UCS-4
- ISO-10646-UCS-2
- UNICODE-1-1-UTF-8
- UNICODE-2-0-UTF-16
- UNICODE-2-0-UTF-8
- US-ASCII
- ISO-8859-1
- ISO-8859-2
- ISO-8859-3
- ISO-8859-4
- ISO-8859-5
- ISO-8859-6
- ISO-8859-7
- ISO-8859-8
- ISO-8859-9
- WINDOWS-1250
- WINDOWS-1251
- WINDOWS-1252
- WINDOWS-1253
- WINDOWS-1254
- WINDOWS-1255
- WINDOWS-1256
- WINDOWS-1257
- WINDOWS-1258
IE's document supports IPersistMoniker loading too. IE uses URL monikers for downloading. You can replace the url moniker created by CreateURLMonikerEx with your own moniker. A few details about URL moniker's implementation can be find here. See if you can get IHTTPNegotiate from the binding context when your BindToStroage implemetation is called.
精彩评论