
C# regular expression sometimes takes very long to run

I have quite a long regex. Sometimes it responds fast, and sometimes it takes a crazy amount of time to run.

Here is my regex:

<div class=""rwResult bg"">.*?mp3/d/[^>]+>(?<Name>[^<]+)</a>.*?artist:[^>]+>(?<Artist>[^<]+).*?user</span>[^>]+[^""]+""(?<Uploader>[^""]+).*?category:.*?"">.*?"">(?<Category>[^<]+).*?time: (?<Duration>[^ ]+) \| (?<StreamSize>[0-9]+) (?<Weight>[^ ]+) \| listened: (?<Clicks>[0-9]+).*?<a href=""(?<DownloadLink>http://dl[^""]+)

Rather than use a lot of regexes, one for each group, I prefer to run a single regex once. Is there any function I could use to detect or avoid the long run time while the regular expression is executing?

I'm working in C# or F#. I hope someone can help with this problem.

Thanks.


It looks like you are trying to parse an XML document using a regular expression. This is not really an optimal approach. My guess is that you are seeing problems because of the use of backtracking in your regular expression.
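If backtracking is indeed the cause, one way to at least bound the run time is the match timeout that `Regex` accepts in .NET Framework 4.5 and later. The following is a minimal sketch, not a definitive fix: `BoundedRegex` and `TryMatch` are hypothetical names, and the options and timeout you pass are assumptions you would tune for your own input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundedRegex
{
    // Hypothetical helper: runs one match, but gives up once the timeout elapses.
    // Returns false when the engine times out (typically excessive backtracking).
    public static bool TryMatch(string pattern, string input, RegexOptions options,
                                TimeSpan timeout, out Match match)
    {
        // The three-argument constructor takes the match timeout.
        var regex = new Regex(pattern, options, timeout);
        try
        {
            match = regex.Match(input);
            return true;
        }
        catch (RegexMatchTimeoutException)
        {
            // The match was abandoned; the caller decides how to fall back.
            match = Match.Empty;
            return false;
        }
    }
}
```

A call such as `BoundedRegex.TryMatch(pattern, html, RegexOptions.Singleline, TimeSpan.FromSeconds(2), out var m)` then returns `false` instead of hanging. Note that `true` only means the engine finished in time, so you still need to check `m.Success`. The timeout limits the damage but does not remove the backtracking itself.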

You could try to rewrite your regular expression, but XML is not a regular language and thus is not parsable by regular expressions.

Take a look at the document "How to read XML from a file by using Visual C#" to get started.

Sidenote: For an entertaining read on what happens when you try to parse a non-regular language with regular expressions, see this Stack Overflow question.


I think you're using the wrong tool. You really want XPath, and possibly XSLT. The only time you want to use a regex to parse raw XML is when the XML is suspected to be syntactically broken in predictable ways.

Seriously, look at XPath. It's magic for delving into the structure of XML documents and pulling out the bits you want.
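As a rough sketch of what that could look like in C#, the snippet below uses `XDocument` together with the `XPathSelectElements` extension method from `System.Xml.XPath`. The `rwResult bg` class and the `mp3/d/` link fragment are taken from the regex in the question; the element nesting and the names `XPathSketch` and `SelectNames` are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathSketch
{
    // Hypothetical helper: returns the text of every track link found inside
    // a result block. Assumes the input is well-formed XML.
    public static List<string> SelectNames(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Any <a> whose href contains "mp3/d/", nested inside a result <div>.
        return doc.XPathSelectElements(
                      "//div[@class='rwResult bg']//a[contains(@href, 'mp3/d/')]")
                  .Select(a => a.Value)
                  .ToList();
    }
}
```

This only works if the markup is well-formed XML; `XDocument.Parse` throws an `XmlException` otherwise. The other groups (artist, uploader, duration and so on) would each get their own XPath expression in the same way.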
