Encoding problems in ASP when using English and Chinese characters

I am having problems with encoding Chinese in an ASP site. The file formats are:

  • translations.txt - UTF-8 (stores my translations)
  • test.asp - UTF-8 (renders the page)

test.asp reads translations.txt, which contains the following data:

Help|ZH|帮助 
Home|ZH|首页

test.asp splits each line on the pipe delimiter. If the user has a cookie set to ZH, it displays the translation; otherwise it falls back to the key value.
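
The lookup described above can be sketched roughly as follows. This is an illustration in Python, not the asker's ASP code: the names `parse_translations` and `translate` are hypothetical, and the file contents are inlined as UTF-8 bytes so that the decoding step is explicit.

```python
# Hypothetical sketch of the key|language|text lookup; not the asker's ASP code.

def parse_translations(raw: bytes) -> dict:
    """Decode UTF-8 bytes and build a {(key, lang): text} table."""
    table = {}
    for line in raw.decode("utf-8").splitlines():
        parts = line.strip().split("|")
        if len(parts) == 3:  # skip blank or malformed lines
            key, lang, text = parts
            table[(key, lang)] = text
    return table


def translate(table: dict, key: str, lang: str) -> str:
    """Return the translation for lang, falling back to the key itself."""
    return table.get((key, lang), key)


# Same two rows as translations.txt, encoded as UTF-8 bytes.
raw = "Help|ZH|帮助 \nHome|ZH|首页\n".encode("utf-8")
table = parse_translations(raw)

print(translate(table, "Home", "ZH"))  # translated text
print(translate(table, "Home", "EN"))  # no ZH-style entry for EN, so the key is returned
```

The point to notice is that the bytes are decoded as UTF-8 before any splitting happens; if that step uses the wrong encoding, every later step works on already-garbled text.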

Now, I have tried the following things, which have not worked:

  1. Add a meta tag

    <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>

  2. Set the Response.CharSet = "UTF-8"

  3. Set the Response.ContentType = "text/html"
  4. Set both Session.CodePage and Response.CodePage to 65001 (UTF-8)
  5. I have confirmed that the text in translations.txt is definitely in UTF-8 and has no byte order mark
  6. The browser is picking up that the page is Unicode UTF-8, but the page is displaying gobbledegook.
  7. The Scripting.OpenTextFile(<file>,<iomode>,<create>,<encoding>) method returns the same incorrect text regardless of the encoding parameter.

Here is a sample of what I want to be displayed in China (ZH):

  • 首页
  • 帮助

But the following is displayed:

  • é¦–é¡µ
  • å¸®åŠ©

This occurs in all tested browsers: Google Chrome, IE 7/8, and Firefox 4. The font definitely includes Chinese glyphs, and I do have East Asian language support installed.

--

I have tried pasting in the original value into the HTML, which did work (but note this is a hard coded value).

  • 首页
  • é¦–é¡µ

However, this is odd.

首页 --(in hex)--> E9 A6 96 E9 A1 B5 --(as chars)--> é¦–é¡µ

Any ideas what I am missing?


In order to read the UTF-8 file, you'll probably need to use the ADODB.Stream object. I don't claim to be an expert on character encoding, but this test worked for me:

test.txt (saved as UTF-8 without BOM):

首页
帮助

test.vbs:

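Option Explicit

' ADODB.Stream constants (not predefined in VBScript)
Const adTypeText = 2    ' StreamTypeEnum: text data
Const adReadLine = -2   ' StreamReadEnum: read up to the next line separator

Dim stream : Set stream = CreateObject("ADODB.Stream")
stream.Open
stream.Type = adTypeText
stream.Charset = "UTF-8"   ' decode the file as UTF-8
stream.LoadFromFile "test.txt"

' Echo the file one line at a time
Do Until stream.EOS
    WScript.Echo stream.ReadText(adReadLine)
Loop

stream.Close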


Whatever part of the process reads translations.txt does not seem to know that the file is UTF-8; it appears to be reading it as some other encoding. Specify the encoding explicitly in whatever code opens and reads that file. This is a separate setting from the encoding of your web page.

Inserting the byte order mark at the beginning of that file may also be a solution.
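
To make the mismatch concrete, here is a small sketch of what happens when UTF-8 bytes are decoded with a single-byte Western code page. It is written in Python purely for illustration and is not part of the ASP fix; Windows-1252 is an assumption, since the thread does not say which default code page the server uses.

```python
# Illustration only: decode the same bytes two ways.
# Windows-1252 is an assumed stand-in for the server's default code page.

text = "首页"
utf8_bytes = text.encode("utf-8")

# The two three-byte UTF-8 sequences, shown as hex.
print(utf8_bytes.hex(" ").upper())

# Wrong: each byte is treated as one Windows-1252 character.
garbled = utf8_bytes.decode("cp1252")
print(garbled)

# Right: the same bytes decoded as UTF-8 give back the original text.
restored = utf8_bytes.decode("utf-8")
print(restored)
```

Two Chinese characters become six Latin-looking characters under the wrong decoding, one per byte, which is consistent with the kind of output the question describes.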


Scripting.OpenTextFile does not understand UTF-8 at all. It can only read the current OEM encoding or Unicode. As you can see from the number of bytes used for these characters, UTF-8 is quite inefficient for some character sets. I would recommend Unicode for this sort of data.

You should save the file as Unicode (in Windows parlance, meaning UTF-16 little-endian) and then open it with:

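Dim fso : Set fso = CreateObject("Scripting.FileSystemObject")
Dim stream : Set stream = fso.OpenTextFile(yourFilePath, 1, False, -1) ' 1 = ForReading, False = do not create, -1 = TristateTrue (Unicode)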
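
The size claim in this answer can be checked with a quick comparison. The sketch below is in Python for illustration only, and `encoded_sizes` is a hypothetical helper; it uses "utf-16-le" so that no byte order mark is counted.

```python
# Compare how many bytes a string takes in UTF-8 versus UTF-16 (no BOM).

def encoded_sizes(s: str) -> tuple:
    """Return (utf8_byte_count, utf16_byte_count) for s."""
    return len(s.encode("utf-8")), len(s.encode("utf-16-le"))


for sample in ("首页", "Help"):
    utf8_len, utf16_len = encoded_sizes(sample)
    print(f"{sample!r}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

These Chinese characters take three bytes each in UTF-8 and two in UTF-16, while ASCII text such as the keys takes one byte per character in UTF-8 and two in UTF-16. For a file that mixes English keys with Chinese text, the saving from UTF-16 is therefore smaller than the Chinese-only figures suggest.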


Just put the following at the top of your page:

Response.CodePage=65001
Response.CharSet="UTF-8"