开发者

Serving ISO-8859-1 and UTF-8 files from ASP.NET

We have a large si开发者_StackOverflow社区te with thousands of static html files. Some of them are ISO-8859-1, others are UTF-8 (with and without byte order marks).

The web.config file looks like this:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.web>
    <globalization requestEncoding="utf-8" responseEncoding="utf-8" fileEncoding="utf-8" />
  </system.web>
</configuration>

If I change the fileEncoding to "ISO-8859-1" it works for both ISO-8859-1 and UTF-8 if there is a byte order mark. We are trying to avoid manually checking and adding byte order marks to files that do not have them. Is there any way to accomplish this?

The files have the charset meta tag. Can we make the server read that to determine the file encoding?

EDIT

If I remove the wildcard application mapping to aspnet_isapi.dll the files are served correctly. Is there any way to have the wildcard match everything except for .html?


You could automate the checking of the files to see if they contain said characters / if they are utf-8.

This is a piece of code which pops up while googling that purpose found here:

System.Text.Encoding enc = null; 
System.IO.FileStream file = new System.IO.FileStream(filePath, 
    FileMode.Open, FileAccess.Read, FileShare.Read); 
if (file.CanSeek) 
{ 
    byte[] bom = new byte[4]; // Get the byte-order mark, if there is one 
    file.Read(bom, 0, 4); 
    if ((bom[0] == 0xef && bom[1] == 0xbb && bom[2] == 0xbf) || // utf-8 
        (bom[0] == 0xff && bom[1] == 0xfe) || // ucs-2le, ucs-4le, and ucs-16le 
        (bom[0] == 0xfe && bom[1] == 0xff) || // utf-16 and ucs-2 
        (bom[0] == 0 && bom[1] == 0 && bom[2] == 0xfe && bom[3] == 0xff)) // ucs-4 
    { 
        enc = System.Text.Encoding.Unicode; 
    } 
    else 
    { 
        enc = System.Text.Encoding.ASCII; 
    } 

    // Now reposition the file cursor back to the start of the file 
    file.Seek(0, System.IO.SeekOrigin.Begin); 
} 
else 
{ 
    // The file cannot be randomly accessed, so you need to decide what to set the default to 
    // based on the data provided. If you're expecting data from a lot of older applications, 
    // default your encoding to Encoding.ASCII. If you're expecting data from a lot of newer 
    // applications, default your encoding to Encoding.Unicode. Also, since binary files are 
    // single byte-based, so you will want to use Encoding.ASCII, even though you'll probably 
    // never need to use the encoding then since the Encoding classes are really meant to get 
    // strings from the byte array that is the file. 

    enc = System.Text.Encoding.ASCII; 
} 
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜