RegEx - HTML between two values

2023-04-07 09:15

I am looking to get the HTML contained between the following tags:

<ul type="square">  
</ul>

What's the most efficient way?

I always use XPath to do things like that.
Use an XPath that will extract the node and then you can fetch the InnerHTML from that node. Very clean, and the right tool for the job.

Additional details: HAP Explorer (from the HTML Agility Pack) is a nice tool for getting the XPath you need. Paste the HTML into HAP Explorer, navigate to the node of interest, and copy the XPath for that node. Store that XPath in a string resource, fetch it at runtime, apply it to the HTML document to extract the node, and then read the desired information from the node.
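For illustration, here is a minimal sketch of the same XPath idea in Python. It is not the HTML Agility Pack API: it assumes the input is well-formed (XHTML-like) markup, because the standard library's xml.etree.ElementTree only parses XML and supports a limited XPath subset. The function name inner_html and the sample markup are made up for this example; real-world HTML would need a proper HTML parser.

```python
import xml.etree.ElementTree as ET


def inner_html(markup: str, xpath: str = ".//ul[@type='square']") -> str:
    """Return the inner markup of the first node matching `xpath`.

    Assumption: `markup` is well-formed XML with a single root element
    that wraps the node being searched for.
    """
    root = ET.fromstring(markup)
    node = root.find(xpath)
    if node is None:
        raise ValueError("no node matched " + xpath)
    # Inner markup = leading text + each child serialized in order.
    # ET.tostring() includes the text that trails each child (its tail).
    parts = [node.text or ""]
    for child in node:
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


SAMPLE = '<div><ul type="square"><li>One</li><li>Two</li></ul></div>'
NESTED = '<div><ul type="square"><li>A<ul><li>B</li></ul></li></ul></div>'
```

Calling inner_html(SAMPLE) returns '<li>One</li><li>Two</li>', and the nested list in NESTED comes back intact because the parser tracks the element tree rather than matching text.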

If you really want a regex (note the s flag, so that . also matches newlines; the m flag only changes how ^ and $ behave):
@<ul type="square">(.*?)</ul>@is
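As a rough sketch of how a lazy pattern like this behaves, here is a Python equivalent. It assumes re.IGNORECASE and re.DOTALL as the counterparts of the i and s flags; the helper name between_ul is hypothetical. It works on a flat list but cuts off early when another list is nested inside.

```python
import re

# Lazy capture between the opening tag and the FIRST closing </ul>.
# DOTALL lets "." match newlines, since list markup usually spans lines.
LAZY = re.compile(r'<ul type="square">(.*?)</ul>', re.IGNORECASE | re.DOTALL)


def between_ul(html):
    """Return the captured text, or None when the pattern does not match."""
    match = LAZY.search(html)
    return match.group(1) if match else None


flat = '<ul type="square">\n<li>One</li>\n</ul>'
nested = '<ul type="square"><li>A<ul><li>B</li></ul></li></ul>'

print(repr(between_ul(flat)))    # '\n<li>One</li>\n'
print(repr(between_ul(nested)))  # '<li>A<ul><li>B</li>' (stops at the inner </ul>)
```

The second result shows why this is only safe when the list is known not to contain nested lists.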

I agree that an HTML parser is the correct way to solve this problem. But, to humor you and answer your original question purely for academic interest, I propose this:

/<[Uu][Ll] +type=("square"|square) *>((.*?(<ul[^>]*>.*<\/ul>)?)*)<\/[Uu][Ll]>/s

I'm sure there are cases where this will fail, but I can't think of any, so please suggest some.

Let me restate that I don't recommend you use this in your project. I am merely doing this out of academic interest, and as a demonstration of WHY a regex that parses HTML is bad and complicated.

Regular expressions should not be used to parse HTML!

This will definitely not work:

<ul type="square">(.*)</ul>
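A small Python sketch of why the greedy version fails, using a made-up page with two lists. The names GREEDY, greedy_capture, and page are illustrative only.

```python
import re

# Greedy capture: ".*" runs to the LAST </ul> in the input.
GREEDY = re.compile(r'<ul type="square">(.*)</ul>', re.DOTALL)


def greedy_capture(html):
    """Return the captured text, or None when the pattern does not match."""
    match = GREEDY.search(html)
    return match.group(1) if match else None


page = '<ul type="square"><li>A</li></ul><p>x</p><ul><li>B</li></ul>'

# The capture swallows the first closing tag, the paragraph,
# and the start of the second, unrelated list.
print(greedy_capture(page))  # <li>A</li></ul><p>x</p><ul><li>B</li>
```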
