JavaScript: handling odd characters in an encoded string

I have received a value that is encoded like this:

%3Cp%3E%0AGlobal%20Business%20Intensive%20Course%20%u2013%

I noticed that one of the characters near the end, the %u2013, is encoded differently from the rest. It appears to be some form of Unicode escape, and it causes "URI malformed" errors when I try to decode the string. Is there a way to replace such sequences with standard percent-encoding? In this example, %u2013 seems to be meant as a dash (U+2013 is the en dash).


To be complete and more correct, the regular expression should also accept the letters A to F, since the part after %u is a four-digit hexadecimal number. You should also include the percent sign in the regular expression; otherwise you end up interpreting the "u2000" in Blu2000 as a Unicode escape sequence, which it isn't.

function fixUnicodeUrl(url) {
    var result = url.replace(/%u[0-9a-f]{4}/gi, function (match) {
        var codepoint = parseInt(match.substring(2), 16);
        var str = String.fromCharCode(codepoint);
        return encodeURIComponent(str);
    });
    return result;
}

var yourUrl = '%3Cp%3E%0AGlobal%20Business%20Intensive%20Course%20%u2013%';
alert(fixUnicodeUrl(yourUrl));
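
To illustrate the two points about the pattern, here is a small sketch that compares a loose pattern (no percent sign, decimal digits only) with the strict one used above. The sample string and the variable names are made up for this illustration; they are not from the original question.

```javascript
// Loose pattern, for comparison only: no percent sign, decimal digits only.
var loose = /u\d{4}/g;
// Strict pattern: percent sign plus exactly four hex digits, case-insensitive.
var strict = /%u[0-9a-f]{4}/gi;

// Hypothetical sample: plain text "Blu2000" plus two %uXXXX escapes,
// one of which contains a hex letter (%u201C).
var sample = 'Blu2000%20%u2013%20%u201C';

// The loose pattern wrongly hits "u2000" inside "Blu2000" and misses %u201C.
var looseMatches = sample.match(loose);   // ['u2000', 'u2013']
// The strict pattern finds exactly the two escapes.
var strictMatches = sample.match(strict); // ['%u2013', '%u201C']

// Convert each %uXXXX escape to UTF-8 percent-encoding.
var fixed = sample.replace(strict, function (match) {
    var codeUnit = parseInt(match.substring(2), 16);
    return encodeURIComponent(String.fromCharCode(codeUnit));
});
// fixed is 'Blu2000%20%E2%80%93%20%E2%80%9C'

var decoded = decodeURIComponent(fixed);
// decoded is 'Blu2000 \u2013 \u201C'
```

Note that this assumes every escape is a complete BMP character. A lone surrogate such as %uD800 would make encodeURIComponent throw a URIError, so input containing surrogate pairs would need extra handling.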


That is malformed for sure. Where are you getting it from?

Here's a way to fix all occurrences of that type of malformation.

var str = '%3Cp%3E%0AGlobal%20Business%20Intensive%20Course%20%u2013%';

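// Match the percent sign and four hex digits, so the leading "%" is replaced too
str = str.replace( /%u([0-9a-f]{4})/gi, function( sequence, hex )
{
  // Build the character from its hex code instead of using eval
  return encodeURIComponent( String.fromCharCode( parseInt( hex, 16 ) ) );
} );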
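
Even after the %uXXXX sequences are converted, the sample string still ends with a dangling "%", so decodeURIComponent would continue to throw on it. Below is a minimal sketch of a guarded decode, assuming you would rather get null than an exception. The helper name tryDecode and the sample inputs are hypothetical.

```javascript
// Hypothetical helper: returns the decoded string, or null when the input
// is not valid percent-encoding (decodeURIComponent throws URIError then).
function tryDecode(encoded) {
    try {
        return decodeURIComponent(encoded);
    } catch (e) {
        if (e instanceof URIError) {
            return null;
        }
        // Anything else is unexpected, so let it propagate.
        throw e;
    }
}

var complete = tryDecode('Course%20%E2%80%93');  // 'Course \u2013'
var dangling = tryDecode('Course%20%E2%80%93%'); // null, because of the trailing '%'
```

Whether to drop, escape, or report a dangling "%" depends on where the data comes from, which is why the question of its origin matters.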