
Finding the XML-escaped dash in JavaScript

I want to use a regular expression to find dashes in an HTML page in JavaScript. The dashes in HTML pages may sometimes be XML-escaped as the string &ndash;. However, a regular expression that searches for this string does not work for some reason.

var html = document.getElementsByTagName('html').item(0).innerHTML;
var escapedDash = /&ndash;/ig;
var foundEscapedDash = html.match(escapedDash);
alert(foundEscapedDash);

The regular expression /&ndash;/ig does not return any matches. Nor does the regular expression /-/i find the escaped dash &ndash;.

Does anyone know of a regular expression that can find the escaped dash?


When you set innerHTML to a string with an entity, it converts it to the literal character. For example:

var div = document.createElement('div');
div.innerHTML = '&ndash;';
alert(div.innerHTML.length); // 1, not 7 as may be expected

So you need to match the actual character that &ndash; decodes to, not the entity text. To do that, you can use the Unicode escape for the character. For "–", it's \u2013.

div.innerHTML.match(/\u2013/ig)
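
If the text might come from a source where the entity has not been decoded yet (for example a raw string rather than innerHTML), one option is to match both forms at once. The following is only a sketch: findEnDashes is a hypothetical helper name, and the list of entity spellings is an assumption about what the input may contain.

```javascript
// Hypothetical helper: finds en dashes in decoded or still-escaped form.
function findEnDashes(text) {
  // \u2013 is the literal character; the alternatives cover the named,
  // decimal and hexadecimal entity spellings of the same character.
  var pattern = /\u2013|&ndash;|&#8211;|&#x2013;/gi;
  // String.prototype.match returns null when nothing matches.
  return text.match(pattern) || [];
}

var decoded = 'pages 10\u201320';                  // what innerHTML returns
var raw = 'pages 10&ndash;20 and 30&#8211;40';     // undecoded source text

console.log(findEnDashes(decoded).length); // 1
console.log(findEnDashes(raw).length);     // 2
console.log(findEnDashes('a-b').length);   // 0, a plain hyphen is not an en dash
```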

By the way, assuming the dash is the first character of the string, you can find the hex number 0x2013 for yourself with div.innerHTML.charCodeAt(0).toString(16).
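
The same lookup can be wrapped in a small function that returns a ready-to-paste escape. This is a minimal sketch; toUnicodeEscape is a hypothetical name, and it assumes the character lies in the Basic Multilingual Plane so that a single four-digit \u escape is enough.

```javascript
// Hypothetical helper: returns the \uXXXX escape for the character at `index`.
function toUnicodeEscape(text, index) {
  var hex = text.charCodeAt(index || 0).toString(16);
  // Pad to four hex digits, the width a \u escape requires.
  while (hex.length < 4) {
    hex = '0' + hex;
  }
  return '\\u' + hex;
}

console.log(toUnicodeEscape('\u2013')); // \u2013
console.log(toUnicodeEscape('-'));      // \u002d
```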


Try this:

var str = '–hello world –';
var escapedDash = /(–+)/ig;

var foundEscapedDash = str.match(escapedDash);
alert(foundEscapedDash);