
Parsing HTML Table using Regex

I am trying to extract the contents of a table using regex.

I have removed most of the tags from the table, but I am stuck on <br>, <a href>, <img>, and <b>. How can I remove them?

For the <b> tag I tried this regex:

 \s*<b[^>]*>\s* 
(?<value>.*?)
 \s* </b>\s*

It worked for some lines, but for others the output still contains the tag, for example:

<b class="saadirheader">Email:</b>

Can anyone help me remove these tags?

<br>, <a href>, <img>, and <b>

Full tags:

<img src="Newrecord_files/spacer.gif" alt="" border="0" height="1" width="5">

<a href="mailto:first.last@email.org">

Thanking you,

Naveen HS


Use the following Regex:

(?:<br|<a href|<img|<b)(?:.(?!>))*.>

This regex matches the opening tags you mentioned above. If there are other tags you forgot to mention, add each one to the first group, separated by "|". Note that the first group has no alternatives for closing tags such as </b> and </a>.
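
One thing to watch with the pattern above: its trailing `.>` must consume a character before the closing `>`, so on a tag with no attributes, such as a bare <b>, the match can run on to a later `>`. As a rough alternative, here is a minimal sketch in Python (the question does not say which language is in use, so that is an assumption). The pattern and the `strip_tags` helper are hypothetical, not part of the answer above: the pattern uses `[^>]*` to stop at the first `>`, `/?` to cover closing tags, and `\b` so that `b` does not also match tags like <body>.

```python
import re

# Hypothetical pattern (not from the answer above): matches opening or
# closing <br>, <a>, <img>, and <b> tags, including any attributes.
TAG_RE = re.compile(r"</?(?:br|a|img|b)\b[^>]*>", re.IGNORECASE)


def strip_tags(html: str) -> str:
    """Remove the listed tags and keep the text between them."""
    return TAG_RE.sub("", html)


sample = (
    '<b class="saadirheader">Email:</b> '
    '<a href="mailto:first.last@email.org">first.last@email.org</a><br>'
)
print(strip_tags(sample))
```

To handle more tags, add their names to the group, for example `br|a|img|b|span`. This is only suitable for simple, well-formed markup like the samples in the question; an attribute value containing `>` would break it.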
