
Disadvantages of using Regular Expressions

Recently my manager advised me not to depend too much on regex because it has a lot of disadvantages. When I tried to learn more, I read that regex can result in memory leaks, because some objects continue to hold on to string references even after use.

.NET RegEx "Memory Leak" investigation

So is it right to say that regex causes memory overhead and should not be used if you have other options? Are there any other disadvantages to regex (apart from it being tough to learn :) )?

P.S. I am developing an application (C#, .NET) similar to a web crawler, which extracts all hrefs and some other information such as the title, meta tags, etc. I have the option of using HTML Agility Pack instead of regex.


Regexes make the code difficult to read. Most of the time, even at the expense of more verbose code, you are better off not using regular expressions. The performance cost and the loss of readability mean that a regex is the wrong choice in most cases, especially at the two extremes: the simple cases, which rarely need one, and the complex cases, where the pattern becomes hard to maintain.

And for the purpose you mention (parsing HTML etc.), regular expressions simply cannot get the job done, because HTML is not a regular language. It is like having a hammer, so that everything looks like a nail.


My view is that regex can often do the job, but you need to be careful not to overuse it. As they say, when all you have is a hammer, every problem looks like a nail.

In this case you are trying to parse HTML to get data out. An HTML parser will be both more readable and probably more reliable. Regular expressions used to parse HTML will often either fail in some circumstances (malformed HTML being the big one) or be far more complicated than the equivalent code using an HTML parser.

I don't know about the memory leaks and performance issues, but even ignoring those I try to keep regex use to a minimum.
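To illustrate the point about a regex failing on real-world HTML, here is a minimal sketch. It is in Python rather than C# only to keep it dependency-free: the standard-library `html.parser` stands in for a real HTML parser such as HTML Agility Pack. The names `HREF_RE`, `LinkCollector`, `extract` and `SAMPLE` are made up for this example, and the pattern is a deliberately naive one, not what anyone would have to write.

```python
import re
from html.parser import HTMLParser

# Deliberately naive pattern (an assumption for this sketch): it expects a
# lowercase tag with a double-quoted href as the very first attribute.
HREF_RE = re.compile(r'<a href="([^"]*)"')


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags and the text inside <title>."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(html):
    """Return (title, list of hrefs) found by the parser."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.hrefs


# Hypothetical input mixing the kinds of markup a crawler runs into.
SAMPLE = """<html><head><title>Demo page</title></head><body>
<a href="/one">one</a>
<A HREF='/two'>two</A>
<a class="nav" href=/three>three</a>
<!-- <a href="/commented-out">old</a> -->
</body></html>"""

if __name__ == "__main__":
    print("regex :", HREF_RE.findall(SAMPLE))
    print("parser:", extract(SAMPLE))
```

On this sample the regex misses the uppercase/single-quoted link and the unquoted one, and it picks up the link inside the comment, while the parser returns the three real links plus the title. The regex can be patched for each of these cases, but every patch makes the pattern harder to read, which is the trade-off described above.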


Regular expressions can obfuscate the logic you are using; sometimes it is less complex to do it in code. In code you can break the different logical tests up and comment each one, so that people can see why you are doing what you are doing.
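As a rough sketch of what "breaking the tests up" can look like, here is the same check written both ways, in Python for brevity. The username rules (3 to 16 characters, starts with a letter, only ASCII letters, digits and underscores, no trailing underscore) and the names `USERNAME_RE`, `is_valid_regex` and `why_invalid` are invented for this illustration.

```python
import re
import string

# All four rules packed into one pattern: a leading letter, 1 to 14 allowed
# characters in the middle, and a final letter or digit (3 to 16 in total).
USERNAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{1,14}[A-Za-z0-9]")


def is_valid_regex(name):
    """Regex version: a yes/no answer with no hint about what went wrong."""
    return USERNAME_RE.fullmatch(name) is not None


ALLOWED = set(string.ascii_letters + string.digits + "_")


def why_invalid(name):
    """Step-by-step version: return the broken rule, or None if name is fine."""
    # Rule 1: length limits.
    if not 3 <= len(name) <= 16:
        return "must be 3 to 16 characters long"
    # Rule 2: the first character has to be an ASCII letter.
    if name[0] not in string.ascii_letters:
        return "must start with a letter"
    # Rule 3: restricted character set.
    if any(ch not in ALLOWED for ch in name):
        return "may contain only letters, digits and underscores"
    # Rule 4: no trailing underscore.
    if name.endswith("_"):
        return "must not end with an underscore"
    return None


if __name__ == "__main__":
    for candidate in ["alice_01", "al", "1alice", "alice!", "alice_"]:
        print(candidate, is_valid_regex(candidate), why_invalid(candidate))
```

Both versions accept and reject the same strings, but the second one names each rule, can report which one failed, and lets a rule be changed without re-deriving the whole pattern. The cost is more lines of code, so this is a judgment call rather than a fixed rule.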
