
Regex: how to get the inner contents of a tag (using JavaScript)?

Page contents:

aa<b>1;2'3</b>hh<b>aaa</b>..
 .<b>bbb</b>
blabla..

I want to get this result:

1;2'3aaabbb

The tags to match are <b> and </b>.

How do I write this regex in JavaScript? Thanks!


Lazyanno,

If and only if:

  1. you have read SLaks's post (as well as the previous article he links to), and
  2. you fully understand the numerous and wondrous ways in which extracting information from HTML using regular expressions can break, and
  3. you are confident that none of the concerns apply in your case (e.g. you can guarantee that your input will never contain nested or mismatched <b>/</b> tags, or occurrences of <b> or </b> within <script>...</script> blocks or <!-- ... --> comments, etc.), and
  4. you absolutely and positively want to proceed with regular expression extraction

...then use:

var str = "aa<b>1;2'3</b>hh<b>aaa</b>..\n.<b>bbb</b>\nblabla..";

var match, result = "", regex = /<b>(.*?)<\/b>/ig;
while (match = regex.exec(str)) { result += match[1]; }

alert(result);

Produces:

1;2'3aaabbb
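
If the text between the tags may span several lines, note that "." does not match a line break, so the loop above would skip such a pair. The following is only a sketch of one way around that: extractTagContents is a hypothetical helper name (not part of the answer above), and it assumes the tag name is a plain word such as "b" and that the opening tags carry no attributes. The last call shows the nested-tag problem from point 3.

```javascript
// Hypothetical helper (not from the original answer): concatenates the
// contents of every <tagName>...</tagName> pair found in `html`.
// Assumes tagName is a plain name such as "b" (no regex metacharacters)
// and that the opening tags have no attributes.
function extractTagContents(html, tagName) {
  // [\s\S]*? matches lazily across line breaks, unlike .*?
  var regex = new RegExp("<" + tagName + ">([\\s\\S]*?)</" + tagName + ">", "gi");
  var result = "";
  var match;
  while ((match = regex.exec(html)) !== null) {
    result += match[1];
  }
  return result;
}

var sample = "aa<b>1;2'3</b>hh<b>aaa</b>..\n.<b>bbb</b>\nblabla..";
console.log(extractTagContents(sample, "b")); // 1;2'3aaabbb

// Nested tags show the limitation: the match ends at the first </b>.
console.log(extractTagContents("<b>outer <b>inner</b> tail</b>", "b")); // outer <b>inner
```

The nested case is exactly why the conditions listed above matter: the regex has no notion of which </b> closes which <b>.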


You cannot parse HTML using regular expressions.

Instead, you should use Javascript's DOM.

For example (using jQuery):

var text = "";
$('<div>' + htmlSource + '</div>')
    .find('b')
    .each(function() { text += $(this).text(); });

I wrap the HTML in a <div> because .find() only searches descendants; with the wrapper it picks up top-level <b> elements as well as nested ones.


Here is an example without a jQuery dependency:

// get all elements with a certain tag name
var b = document.getElementsByTagName("B");

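// getElementsByTagName() returns an HTMLCollection, which has no map()
// method of its own, so borrow Array.prototype.map; it executes a function
// on each element and builds a new array from the function results...
var text = Array.prototype.map.call(b, function(element) {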
  // ...in this case we are interested in the element text
  if (typeof element.textContent != "undefined")
    return element.textContent; // standards compliant browsers
  else
    return element.innerText;   // IE
});

// now that we have an array of strings, we can join it
var result = text.join('');


// note: this strips every tag (not only <b>) and keeps all the remaining text
var regex = /(<([^>]+)>)/ig;
var bdy = "aa<b>1;2'3</b>hh<b>aaa</b>..\n.<b>bbb</b>\nblabla..";

var result = bdy.replace(regex, "");
alert(result);

See: http://jsfiddle.net/abdennour/gJ64g/


If you want to use regular expressions, just add a '?' after the quantifier in the pattern that matches the inner text, so that it matches as little as possible. For example, change:

".*" to "(.*?)"
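
To make the difference concrete, here is a small sketch comparing the two forms. The input string is an illustrative assumption, not taken from the question.

```javascript
// Illustrative input (an assumption, not from the question).
var line = "<b>one</b> and <b>two</b>";

// Greedy: .* runs to the LAST </b> on the line, swallowing the tags in between.
var greedy = /<b>(.*)<\/b>/.exec(line)[1];

// Lazy: .*? stops at the FIRST </b> it can reach.
var lazy = /<b>(.*?)<\/b>/.exec(line)[1];

console.log(greedy); // one</b> and <b>two
console.log(lazy);   // one
```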