IE7 won't display bmp files with encoded filenames
I have a test page that displays two images. One is called hello.bmp and the other is called 徘吐驴欸觰.bmp (this is a random collection of Chinese characters; apologies if it means something weird). For the latter image, I use a percent-encoded filename in the page's HTML.
The HTML is pretty straightforward:
<img src="%E5%BE%98%E5%90%90%E9%A9%B4%E6%AC%B8%E8%A7%B0.bmp" />
<img src="hello.bmp" />
In Internet Explorer 7, the image with the encoded file path does not display (red X). All other browsers display it.
Does anyone know what would cause this? Can it be avoided?
Character encoding of file:/// URLs works differently across browsers on Windows.
Windows filenames are natively Unicode-based, so when you use a URL, which is byte-based, it has to convert that sequence of bytes to Unicode characters using an encoding. What encoding? There is no standard to say, but there are two obvious possibilities:
UTF-8, since it covers everything and is a popular default encoding, also used by the IRI standard for putting Unicode in URIs;
the (misleadingly-named) “ANSI” code page, which is an arbitrary default that varies from system to system. On a Western European Windows install it will be code page 1252 (which is similar to ISO-8859-1); on a Chinese Windows install it will be code page 936 (similar to GB2312).
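To make the ambiguity concrete, here is a small Python sketch (not anything a browser actually runs) that takes the bytes from the question's URL and decodes them under each candidate encoding. The variable names are only illustrative, and `errors="replace"` is used so that bytes with no mapping in a code page show up as replacement characters instead of raising.

```python
from urllib.parse import unquote_to_bytes

# The src attribute from the question.
src = "%E5%BE%98%E5%90%90%E9%A9%B4%E6%AC%B8%E8%A7%B0.bmp"

# A URL is byte-based: undo the %-escapes to get the raw bytes.
raw = unquote_to_bytes(src)

# The same bytes, interpreted three different ways.
as_utf8 = raw.decode("utf-8")                        # the intended filename
as_cp1252 = raw.decode("cp1252", errors="replace")   # Western European "ANSI"
as_cp936 = raw.decode("cp936", errors="replace")     # Chinese "ANSI"

print(as_utf8)
print(as_cp1252)
print(as_cp936)
```

Only the UTF-8 reading yields the name of the file that is actually on disk; the two ANSI readings produce different, unrelated strings.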
The ANSI code page is a pain because you never know what it's going to be, it's never UTF-8, and if your filename contains characters that don't exist in ANSI (which will certainly be the case if you have the filename 徘吐驴欸觰.bmp on a Western Windows install) you can't access the file at all.
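As a rough illustration of that last point, the following Python sketch checks whether a filename can be expressed in a given code page at all. The helper name `representable` is made up for this example; Python's `cp1252` codec stands in for the Western "ANSI" code page.

```python
def representable(filename, codepage):
    """Return True if every character of filename exists in codepage."""
    try:
        filename.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False

name = "徘吐驴欸觰.bmp"

# False: none of these characters exist in code page 1252, so no
# sequence of %-escaped bytes could ever name this file there.
chinese_in_1252 = representable(name, "cp1252")

# True: plain ASCII names are safe under any of the encodings discussed.
ascii_in_1252 = representable("hello.bmp", "cp1252")
```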
So which do the browsers use?
- IE: ANSI code page
- Safari/Opera: UTF-8
- Chrome/Firefox: UTF-8, unless the bytes are not a valid UTF-8 sequence, in which case the ANSI code page is used instead.
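The Chrome/Firefox rule in the list above can be sketched in a few lines of Python. This is only a model of the behaviour as described here, not the browsers' actual code; the function name `decode_url_bytes` and the default of `cp1252` for the ANSI code page are assumptions for the example.

```python
from urllib.parse import unquote_to_bytes

def decode_url_bytes(raw, ansi_codepage="cp1252"):
    """Try UTF-8 first; fall back to the ANSI code page if that fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not a valid UTF-8 sequence, so treat the bytes as ANSI.
        return raw.decode(ansi_codepage, errors="replace")

# Valid UTF-8 escapes are read as UTF-8.
utf8_result = decode_url_bytes(unquote_to_bytes("%E5%BE%98.bmp"))

# %E9 followed by "." is not valid UTF-8, so the fallback applies
# and the byte is read as U+00E9 in code page 1252.
ansi_result = decode_url_bytes(unquote_to_bytes("caf%E9.bmp"))
```

A browser that skips the first step and goes straight to the ANSI code page, as IE is described as doing, would misread the first URL.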
So in conclusion, you can't reliably use non-ASCII characters in file:/// URLs at all.
This is in contrast to HTTP. The IIS web server, for example, has the same UTF-8-with-fallback-to-ANSI behaviour as Chrome and Firefox. Non-ASCII characters via IRI and a suitably-configured server are fine, but not the local filesystem.
(On non-Windows platforms filenames are natively bytes, usually representing UTF-8-encoded characters, but still bytes. So there is no ambiguity between the filesystem names and the byte-based URL %-sequences.)
die ANSI code page die. Why won't Microsoft kill you? You have long outstayed your welcome. You ruin everything.