URL with Unicode - ISAPI_Rewrite doesn't recognize it
I have been using ISAPI_Rewrite v2 for URL rewriting for quite a while. The site is in Hebrew, and so are the page URLs. ISAPI_Rewrite v2 doesn't support Hebrew characters, but I overcame this problem by using the UTF-8 (hex) codes for the Hebrew characters. Here is an example:
RewriteRule ^/\%D7\%A6\%D7\%95\%D7\%A8_\%D7\%A7\%D7\%A9\%D7\%A8/$ /Contact.aspx [L,I]
RewriteRule ^/\%D7\%A6\%D7\%95\%D7\%A8_\%D7\%A7\%D7\%A9\%D7\%A8$ /Contact.aspx [L,I]
The problem: while checking my popular pages in StatCounter, I came across this URL:
http://mysite.com/%u05F6%u05E5%u05F8_%u05F7%u05F9%u05F8
Which is the same URL rule as in my example but in Unicode! Apparently ISAPI_Rewrite v2 doesn't handle these URLs, and the user gets "The page cannot be found".
There are also pages that are more complex, for example ones that send part of the URL as a query parameter, which is also in Unicode.
I have thought of only one solution: make the same rules again, this time in Unicode, and deal with the Unicode in the code-behind. But there are two problems with this solution:
- The URL is shown to the user in Unicode and not in Hebrew.
- It adds code to the code-behind which, in my opinion, doesn't need to be there. What I mean is that this scenario can and should be handled before the request reaches the code.
Any thoughts?
Thanks.
EDIT: Maybe this redirection can be accomplished by IIS6 somehow? Whenever IIS identifies a Unicode URL, it would convert it to UTF-8 and redirect the page.
ISAPI_Rewrite v2 doesn't support Hebrew characters, but I overcame this problem by using UTF-8
IIS in general requires you to use UTF-8 in URLs. There is a fallback to using the default locale-specific (‘ANSI’) encoding when the URL isn't a valid UTF-8 sequence, but that's (a) no use if your server's locale isn't Hebrew (code page 1255), and (b) still not wholly reliable as some cp1255 strings can also be valid UTF-8 sequences. So, yes, for reliability always use the UTF-8 form.
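To make point (b) concrete, here is a small sketch in Python (used only for illustration, since it ships with a cp1255 codec). The helper name is_valid_utf8 and the three-character string are made up for this demo. The string is deliberately contrived: it is picked so that its cp1255 bytes happen to form a valid UTF-8 sequence, whereas an ordinary Hebrew word does not.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# An ordinary Hebrew word in cp1255: bytes F6 E5 F8, which are not valid UTF-8,
# so a server could safely fall back to the ANSI code page for it.
plain_word = "\u05e6\u05d5\u05e8".encode("cp1255")

# A contrived string (alef, no-break space, euro sign) in cp1255: bytes E0 A0 80.
# Those same bytes are also a well-formed three-byte UTF-8 sequence, so the
# server cannot tell which encoding the client meant.
ambiguous = "\u05d0\u00a0\u20ac".encode("cp1255")

print(plain_word.hex(), is_valid_utf8(plain_word))
print(ambiguous.hex(), is_valid_utf8(ambiguous))
```

Collisions like the second one are unusual in real page names, but they are the reason the ANSI fallback cannot be relied on.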
http://mysite.com/%u05F6%u05E5%u05F8_%u05F7%u05F9%u05F8
Which is the same URL rule as in my example but in Unicode!
Not really. The %uxxxx syntax comes from the JavaScript escape() function and is specific to that function's custom form of encoding. It has no relation to standard URL-encoding. The above is not even a valid URL and won't be accepted by some browsers.
You need to find where that link is coming from and fix it to use proper UTF-8-%xx-encoding instead.
In the meantime you might be able to do something with a 404 handler that redirects to the canonical form instead.
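As a rough sketch of what such a handler would need to compute, the snippet below rewrites escape()-style %uXXXX sequences as UTF-8 %XX escapes and reports whether a redirect is needed. It is written in Python purely for illustration; the function names canonicalize_path and redirect_target are hypothetical, the sample path is my own (the %u form of the Hebrew in the rule above), and wiring the result into an actual IIS/ASP.NET 404 page is left out.

```python
import re
from urllib.parse import quote

# Matches one JavaScript escape()-style sequence such as %u05E6.
_U_ESCAPE = re.compile(r"%u([0-9A-Fa-f]{4})")


def canonicalize_path(path: str) -> str:
    """Rewrite %uXXXX sequences as UTF-8 %XX escapes; leave everything else as-is."""
    def to_utf8_escape(match):
        code = int(match.group(1), 16)
        # escape() emits UTF-16 code units; a lone surrogate cannot be encoded
        # as UTF-8, so this sketch simply leaves such sequences untouched.
        if 0xD800 <= code <= 0xDFFF:
            return match.group(0)
        return quote(chr(code), safe="")
    return _U_ESCAPE.sub(to_utf8_escape, path)


def redirect_target(requested_path: str):
    """Return the canonical path to redirect to, or None if no rewrite applies."""
    canonical = canonicalize_path(requested_path)
    return canonical if canonical != requested_path else None


print(redirect_target("/%u05E6%u05D5%u05E8_%u05E7%u05E9%u05E8"))
```

A 404 handler built on this idea would issue a permanent redirect when redirect_target returns a path and show the normal error page otherwise, so the existing UTF-8 rewrite rules stay the only place where the Hebrew URLs are defined.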
If you use a FastCGI extension behind IIS, you can try configuring FastCGI to use UTF-8 encoding for a particular set of server variables: use the REG_MULTI_SZ registry value FastCGIUtf8ServerVariables and set it to a list of server variable names.
reg add HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\w3svc\Parameters /v FastCGIUtf8ServerVariables /t REG_MULTI_SZ /d REQUEST_URI\0PATH_INFO
https://www.iis.net/learn/application-frameworks/install-and-configure-php-on-iis/configuring-the-fastcgi-extension-for-iis-60#utf8servervars