How do we create such a regular expression to extract data?

2022-12-11 18:11

<br>Aggie<br><br>John<br><p>Hello world</p><br>Mary<br><br><b>Peter</b><br>

I'd like to create a regexp that safely matches these:

<br>Aggie<br>
<br>John<br>
<br>Mary<br>
<br><b>Peter</b><br>

It is possible that there are other tags (e.g. <strike>, etc.) between each pair of <br> tags, and they have to be collected too, just like <b>Peter</b>.

What should the regexp look like?

If you learn one thing on SO, let it be: "Do not parse HTML with a regex". Use an HTML parser.
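As a rough illustration of the parser route, here is a minimal sketch using Python's standard-library `html.parser`. The names `BrSegmentParser`, `end_segment` and `extract_segments` are made up for this example. It also makes a simplifying assumption: only the text between <br> tags is kept, so inline tags such as <b> are dropped and the <p> paragraph is reported too.

```python
from html.parser import HTMLParser


class BrSegmentParser(HTMLParser):
    """Collect the text found between consecutive <br> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.segments = []   # finished text segments
        self._current = []   # text pieces seen since the last <br>

    def handle_starttag(self, tag, attrs):
        # Every <br> closes the segment collected so far.
        if tag == "br":
            self.end_segment()

    def handle_data(self, data):
        self._current.append(data)

    def end_segment(self):
        text = "".join(self._current).strip()
        if text:  # skip the empty gaps between adjacent <br> tags
            self.segments.append(text)
        self._current = []


def extract_segments(html):
    parser = BrSegmentParser()
    parser.feed(html)
    parser.close()        # process any input the parser still buffers
    parser.end_segment()  # keep trailing text after the last <br>, if any
    return parser.segments


sample = ("<br>Aggie<br><br>John<br><p>Hello world</p>"
          "<br>Mary<br><br><b>Peter</b><br>")
print(extract_segments(sample))
# ['Aggie', 'John', 'Hello world', 'Mary', 'Peter']
```

Whether "Hello world" should be skipped, and whether the inner tags should be kept, depends on rules the question does not spell out, so treat this as a starting point rather than a drop-in answer.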

<br>.*?<br>

will match anything from one <br> tag to the closest following one.

The main problem with parsing HTML using regexes is that regexes can't handle arbitrarily nested structures. This is not a problem in your example.
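To see what the `<br>.*?<br>` pattern does on the sample from the question, here is a small Python sketch. The variable names are illustrative only.

```python
import re

sample = ("<br>Aggie<br><br>John<br><p>Hello world</p>"
          "<br>Mary<br><br><b>Peter</b><br>")

# Lazy match: from one <br> to the closest following <br>.
br_pair = re.compile(r"<br>.*?<br>")

matches = br_pair.findall(sample)
print(matches)
# ['<br>Aggie<br>', '<br>John<br>', '<br>Mary<br>', '<br><b>Peter</b><br>']
```

Two caveats apply. Matches do not overlap, so each closing <br> is consumed by its match: in an input like `<br>A<br>B<br>` only `<br>A<br>` is found, and this is also why `<p>Hello world</p>` is skipped in the sample. In Python, `.` does not match a newline unless `re.DOTALL` is passed, so content spanning several lines would need that flag.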

Split the string at (<br>)+. You'll get empty strings at the beginning and the end of the result, so you need to remove them, too.

If you want to preserve the <br>, then this is not possible unless you know that there is one before and after each element in the result.
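The splitting approach can be sketched in Python as follows. A non-capturing group `(?:<br>)+` is used here instead of `(<br>)+`, because `re.split` would otherwise include the captured separators in the result. The variable names are illustrative.

```python
import re

sample = ("<br>Aggie<br><br>John<br><p>Hello world</p>"
          "<br>Mary<br><br><b>Peter</b><br>")

# Split at every run of one or more <br> tags.
parts = re.split(r"(?:<br>)+", sample)
print(parts)
# ['', 'Aggie', 'John', '<p>Hello world</p>', 'Mary', '<b>Peter</b>', '']

# Drop the empty strings left at the beginning and the end.
items = [part for part in parts if part]
print(items)
# ['Aggie', 'John', '<p>Hello world</p>', 'Mary', '<b>Peter</b>']
```

Note that the result contains `<p>Hello world</p>` as well, since splitting cannot tell which pieces were wrapped in their own pair of <br> tags, and the <br> tags themselves are gone from every item.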
