URL with Unicode - ISAPI_Rewrite doesn't recognize it

I have been using ISAPI_Rewrite v2 for URL rewriting for quite a while. The site is in Hebrew, and so are the page URLs. ISAPI_Rewrite v2 doesn't support Hebrew characters, but I overcame this problem by using the UTF-8 (hex) codes for the Hebrew characters. Here is an example:

RewriteRule ^/\%D7\%A6\%D7\%95\%D7\%A8_\%D7\%A7\%D7\%A9\%D7\%A8/$ /Contact.aspx [L,I]
RewriteRule ^/\%D7\%A6\%D7\%95\%D7\%A8_\%D7\%A7\%D7\%A9\%D7\%A8$ /Contact.aspx [L,I]

The problem: while checking my popular pages in StatCounter, I came across this URL:

http://mysite.com/%u05F6%u05E5%u05F8_%u05F7%u05F9%u05F8

Which is the same URL rule as in my example but in Unicode! Apparently ISAPI_Rewrite v2 doesn't handle these URLs, and the user gets "The page cannot be found".

There are also more complex pages, for example ones that pass part of the URL as a query parameter, which is also in Unicode.

I thought of only one solution: create the same rules again, this time in Unicode, and deal with the Unicode in the code-behind. But there are two problems with this solution:

  1. The URL is shown to the user in Unicode form and not in Hebrew.
  2. It adds code to the code-behind that, in my opinion, doesn't need to be there. This scenario can and should be handled before the request reaches the code.

Any thoughts?

Thanks.

EDIT: Maybe this redirection can be accomplished by IIS6 somehow? Whenever IIS identifies a Unicode URL, it would convert it to UTF-8 and redirect the page.


ISAPI_Rewrite v2 doesn't support Hebrew characters, but I overcame this problem by using UTF-8

IIS in general requires you to use UTF-8 in URLs. There is a fallback to using the default locale-specific (‘ANSI’) encoding when the URL isn't a valid UTF-8 sequence, but that's (a) no use if your server's locale isn't Hebrew (code page 1255), and (b) still not wholly reliable as some cp1255 strings can also be valid UTF-8 sequences. So, yes, for reliability always use the UTF-8 form.
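To make the difference between the two encodings concrete, here is a minimal sketch in Python (used only as a neutral illustration language; the site itself is ASP.NET). It percent-encodes the Hebrew path segment that the rules in the question spell out, once as UTF-8 and once as cp1255, and then shows one byte pair that decodes without error under both encodings, which is the ambiguity described in point (b). The variable names are illustrative.

```python
from urllib.parse import quote

# The path segment encoded by the rewrite rules in the question, written
# with \u escapes to avoid bidirectional-text rendering issues in source.
segment = "\u05e6\u05d5\u05e8_\u05e7\u05e9\u05e8"

# Percent-encode the same text under the two encodings discussed above.
utf8_form = quote(segment, safe="")                       # UTF-8 bytes
cp1255_form = quote(segment, safe="", encoding="cp1255")  # Hebrew 'ANSI' bytes

print(utf8_form)    # %D7%A6%D7%95%D7%A8_%D7%A7%D7%A9%D7%A8
print(cp1255_form)  # %F6%E5%F8_%F7%F9%F8

# Point (b): the same two bytes are legal in both encodings,
# but they mean different things.
ambiguous = b"\xd7\xa6"
as_utf8 = ambiguous.decode("utf-8")      # one Hebrew letter
as_cp1255 = ambiguous.decode("cp1255")   # two unrelated characters
print(len(as_utf8), len(as_cp1255))      # 1 2
```

The UTF-8 output matches the pattern used in the rewrite rules (minus the backslashes that escape `%` in the rule syntax).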

http://mysite.com/%u05F6%u05E5%u05F8_%u05F7%u05F9%u05F8

Which is the same URL rule as in my example but in Unicode!

Not really. The %uxxxx syntax comes from the JavaScript escape() function and is specific to that function's custom form of encoding. It has no relation to standard URL-encoding. The above is not even a valid URL and won't be accepted by some browsers.

You need to find where that link is coming from and fix it to use proper UTF-8-%xx-encoding instead.

In the meantime you might be able to do something with a 404 handler that redirects to the canonical form instead.
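As a rough sketch of the conversion such a handler would need, the Python function below rewrites `%uXXXX` escapes as standard UTF-8 percent-encoding; the result would then be used as the target of a permanent redirect. This is an illustration under assumptions, not code for IIS or ISAPI_Rewrite: the function name `percent_u_to_utf8` is hypothetical, and the same logic would have to be ported to whatever language the 404 handler is written in. Note also that the code points in the logged URL above (U+05F6 and so on) do not correspond to the UTF-8 bytes in the rewrite rules, so the usage line assumes the code points that do (U+05E6, U+05D5, U+05E8, ...).

```python
import re
from urllib.parse import quote

# One or more consecutive %uXXXX escapes. Matching whole runs lets
# surrogate pairs produced by JavaScript's escape() be decoded together.
_PCT_U_RUN = re.compile(r"(?:%u[0-9A-Fa-f]{4})+")


def percent_u_to_utf8(path: str) -> str:
    """Rewrite non-standard %uXXXX escapes as UTF-8 percent-encoding.

    Hypothetical helper: everything outside the %uXXXX runs is left as-is.
    """
    def repl(match: re.Match) -> str:
        # Strip the "%u" prefixes, leaving a string of UTF-16 code units.
        hex_digits = match.group(0).replace("%u", "")
        # Decode the code units; unpaired surrogates become U+FFFD.
        text = bytes.fromhex(hex_digits).decode("utf-16-be", errors="replace")
        # Re-encode as UTF-8 and percent-escape every byte.
        return quote(text, safe="")

    return _PCT_U_RUN.sub(repl, path)


# Assumed input: the %u form of the code points that match the rules.
print(percent_u_to_utf8("/%u05E6%u05D5%u05E8_%u05E7%u05E9%u05E8"))
# /%D7%A6%D7%95%D7%A8_%D7%A7%D7%A9%D7%A8
```

Redirecting to the converted URL, rather than serving the page directly, also keeps a single canonical address for each page.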


If you run a FastCGI extension behind IIS, you can try configuring FastCGI to use UTF-8 encoding for a particular set of server variables: set the REG_MULTI_SZ registry value FastCGIUtf8ServerVariables to a list of server variable names.

reg add HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\w3svc\Parameters /v FastCGIUtf8ServerVariables /t REG_MULTI_SZ /d REQUEST_URI\0PATH_INFO

https://www.iis.net/learn/application-frameworks/install-and-configure-php-on-iis/configuring-the-fastcgi-extension-for-iis-60#utf8servervars
