Finding the XML-escaped dash in JavaScript
I want to use a regular expression to find dashes in an HTML page in JavaScript. The dashes in HTML pages may sometimes be XML-escaped as the string &ndash;. However, using a regular expression to find this string is not working for some reason.
var html = document.getElementsByTagName('html').item(0).innerHTML;
var escapedDash = /&ndash;/ig;
var foundEscapedDash = html.match(escapedDash);
alert(foundEscapedDash);
The regular expression /&ndash;/ig does not return any matches. Nor does the regular expression /-/i find the escaped dash &ndash;.
Does anyone know of a regular expression that can find the escaped dash?
When you set innerHTML to a string with an entity, it converts it to the literal character. For example:
var div = document.createElement('div');
div.innerHTML = '&ndash;';
alert(div.innerHTML.length); // 1, not 7 as may be expected
So you need to match the actual character rather than the entity &ndash;, and to do that you can use the Unicode escape sequence in the regular expression. For "–", it's \u2013.
div.innerHTML.match(/\u2013/ig)
By the way, assuming the dash is the first character of the string, you can find the hex number 0x2013 for yourself with div.innerHTML.charCodeAt(0).toString(16).
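As a rough sketch of that lookup without a DOM, the snippet below starts from a string that already contains the literal en dash, which is what innerHTML returns after decoding. The helper name toUnicodeEscape is made up for illustration, not a built-in.

```javascript
// Hypothetical helper (not a built-in): builds the \uXXXX escape
// for the first character of a string.
function toUnicodeEscape(text) {
  const hex = text.charCodeAt(0).toString(16).toUpperCase();
  return '\\u' + hex.padStart(4, '0');
}

// Assumption: this stands in for what innerHTML returns once the
// entity has been decoded to the literal en dash character.
const decoded = '\u2013hello world \u2013';

const escapeSequence = toUnicodeEscape(decoded); // the six characters \u2013
const matches = decoded.match(/\u2013/g);        // both en dashes

console.log(escapeSequence, matches.length);
```

The i flag is left off here because an en dash has no upper or lower case, so it makes no difference to the match.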
Try this:
var str = '&ndash;hello world &ndash;';
var escapedDash = /(&ndash;+)/ig;
var foundEscapedDash = str.match(escapedDash);
alert(foundEscapedDash);
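Note that this matches the entity text in a plain string, not in a value read back from innerHTML, where the entity has already been decoded. If both forms may turn up, one possible approach is a single pattern that covers the literal character, the named entity, and the numeric references. This is only a sketch, and the names anyEnDash and countEnDashes are invented for the example.

```javascript
// Matches a literal en dash, the named entity &ndash;, or its
// decimal (&#8211;) or hexadecimal (&#x2013;) numeric reference.
const anyEnDash = /\u2013|&ndash;|&#8211;|&#[xX]2013;/g;

// Hypothetical helper: counts en dashes in any of the forms above.
function countEnDashes(text) {
  const found = text.match(anyEnDash);
  return found ? found.length : 0;
}

console.log(countEnDashes('&ndash;hello world &ndash;'));
console.log(countEnDashes('a \u2013 b &#8211; c &#x2013; d'));
```

An ordinary hyphen-minus (-) is a different character and is deliberately not matched by this pattern.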