Chinese/Japanese characters in a search box and form
Why is it that when I use Firefox to enter 漢, the GET request is transformed to:
q=%E6%BC%A2&start=0
However, when I use IE8 and type the same Chinese character, the GET is:
q=?&start=0
It turns it into a question mark.
Mark the encoding of the page as UTF-8 and this problem will go away. Firefox will fail to autodetect your encoding without this hint sometimes, too. And you may have manually changed the encoding in IE once, so that becomes the new default for unmarked pages.
Put this in your <HEAD>:
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
If your content isn't really in UTF-8, then you'll need to use an alternate method. There's an HTML attribute on FORM (accept-charset) that hints to IE that you want non-ANSI codepage characters to be sent as UTF-8, but it's far nicer to just use the correct content type.
Also, the address bar may not be the best place to look at the resulting text, as the last time I checked, it didn't reliably work with non-ACP characters. Make sure you're looking at the actual request data.
If you're talking about entering text into the address bar or search box in the browser, and not a specific web page, I don't reproduce this problem on English Windows 7. Perhaps you're using a very old version of Windows and your system ANSI code page does not contain that character; Win95/Win98/WinME would certainly have that problem.
Edited to add: In IE 8, entering the character you specified on a page containing this content works exactly as expected for me. I've verified this with Fiddler. Whatever problem you are having is probably different than what you have described so far.
<HTML>
<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
</HEAD>
<BODY>
<form accept-charset="utf-8" method="get" action="http://www.example.com/something">
<input type="text" name="q">
<input type="submit">
</form>
</BODY>
</HTML>
You don't actually need the accept-charset attribute unless the page itself uses a different encoding, but I am leaving it in for illustrative purposes. For it to be actually useful, at least in earlier versions of IE (things may have changed; a colleague of mine specified the behavior back in IE5 or so), you need a hidden "_charset_" field with no value to encourage the browser to mark what charset it actually used, but that's superfluous in a UTF-8 page.
It can be either a font installation issue or a URL encoding issue.
One of the main issues I have seen when dealing with CJK characters is that the East Asian language fonts are not installed by default when the OS is installed. These characters can show up properly in MS Word even without that installation. To make sure all applications on the OS can handle CJK (Chinese, Japanese and Korean) text, it is better to do the following:
- Go to Control Panel
- Select Regional and Language Options
- Go to the Languages tab
- Select checkbox to install fonts for East Asian Languages
Hopefully you have the Windows CD with you to proceed with this.
After that, IE8 should hopefully show the characters properly.
Also, if you are doing any URL encoding, make sure you always use UTF-8 as the character encoding when dealing with non-ASCII characters.
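As a rough illustration of that last point, here is a minimal Python sketch that percent-encodes the character from the question using UTF-8. The helper name `encode_query_value` is made up for this example; `quote` and `unquote` are the standard `urllib.parse` functions. It only shows the encoding step and is not a claim about what either browser does internally.

```python
from urllib.parse import quote, unquote


def encode_query_value(text: str) -> str:
    """Percent-encode a query value from its UTF-8 bytes (hypothetical helper)."""
    # safe="" means reserved characters such as "/" are escaped as well.
    return quote(text, safe="", encoding="utf-8")


encoded = encode_query_value("漢")

# Build the same query string shape as in the question.
print("q=" + encoded + "&start=0")

# Decoding with the same charset recovers the original character.
print(unquote(encoded, encoding="utf-8") == "漢")
```

The encoded value matches the `%E6%BC%A2` that Firefox sent in the question, which is consistent with the three UTF-8 bytes of the character being escaped one by one.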
To begin with, IE believes that Chinese characters can be sent 'as is' in UTF-8, while Firefox thinks they need to be URL-encoded.
Have you watched the GET request on the wire? I bet that it's really a three-byte sequence and that the tool you are using to display it is reducing it to a ?.
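One way a display tool could end up showing a question mark is sketched below in Python. This is only a guess at the mechanism: the function `lossy_display` and the choice of `cp1252` as the tool's code page are assumptions for illustration, not something taken from the question.

```python
def lossy_display(wire_bytes: bytes, display_charset: str = "cp1252") -> str:
    """Mimic a tool that renders UTF-8 request bytes in a legacy code page.

    Hypothetical helper: characters missing from the code page are replaced
    with "?" by the errors="replace" handler.
    """
    text = wire_bytes.decode("utf-8")
    return text.encode(display_charset, errors="replace").decode(display_charset)


# What would travel on the wire if the browser sent raw UTF-8.
wire = "漢".encode("utf-8")

print(len(wire))            # number of bytes in the sequence
print(wire.hex())           # the raw bytes, as hex
print(lossy_display(wire))  # what the lossy tool would show
```

If the bytes on the wire are intact, the problem lies in the viewer rather than in the request, which is why capturing the raw request (for example with Fiddler, as mentioned above) is the more reliable check.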