Improving Perl regex performance by adding +

2023-01-09 14:40

I have some regexes in a Perl script that are correct but slow. I am considering trying to improve performance by adding extra + operators (i.e. *+ instead of * and ++ instead of +) to disable backtracking. I tried replacing all of them and the regexes stopped working... so much for the simple solution. How do I know where I can add them without breaking the regex?

If the regexes stopped working, you either aren't using a version of perl that supports them, or you actually do need backtracking in those cases.
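As a minimal sketch of the second case (assuming Perl 5.10 or later, with a sample string and patterns made up purely for illustration), here is a regex that stops matching once its quantifier is made possessive, because the following \d depends on the quantifier giving a character back:

```perl
use strict;
use warnings;

my $s = 'abc1';

# Greedy: \w+ first takes "abc1", then gives back "1" so \d can match.
my $greedy = $s =~ /^\w+\d$/ ? 1 : 0;

# Possessive: \w++ takes "abc1" and refuses to give anything back,
# so \d has nothing left to match and the whole match fails.
my $possessive = $s =~ /^\w++\d$/ ? 1 : 0;

print "greedy: $greedy, possessive: $possessive\n";
```

Any quantifier whose match overlaps with what comes right after it, as \w and \d do here, is a candidate for this kind of breakage.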

Identify the sections of the regex that will never need backtracking (that is, sections where, when asked to match starting at a given point, there is never more than one length you would want them to match), and wrap them in (?> ). This has the same effect as ++/*+ and is supported even before Perl 5.10.
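A small sketch of that rule, using invented sample strings: the first pattern shows that (?> ) gives up nothing once it has matched, and the second shows a section that is safe to make atomic because what follows it (a comma) can never be matched by the section itself.

```perl
use strict;
use warnings;

# Atomic group: behaves like \w++ here, so the match fails because
# the group will not give "1" back to \d.
my $atomic_fails = 'abc1' =~ /^(?>\w+)\d$/ ? 0 : 1;

# Safe case: digits can never match a comma, so \d+ has only one
# useful length at any starting point and can be made atomic.
my $plain  = '123,abc' =~ /^\d+,[a-z]+$/     ? 1 : 0;
my $atomic = '123,abc' =~ /^(?>\d+),[a-z]+$/ ? 1 : 0;

print "atomic_fails: $atomic_fails, plain: $plain, atomic: $atomic\n";
```

In the safe case both versions accept the same input; the atomic one simply stops the engine from retrying shorter runs of digits when a later part of the pattern fails.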

Note that restricting backtracking is often not an "optimization", since it changes what will and will not be matched. The idea is that you use it to better describe what you actually want matched. Borrowing from the article linked in the OP's answer, something like ^(.*?,){11}P (twelfth comma-separated field starts with P) is not just inefficient, it is incorrect, since backtracking will cause it to match even when only a field after the twelfth starts with P. By correcting it to ^(?>.*?,){11}P you restrict it to matching exactly the intended number of leading fields. (In this trivial case, ^([^,]*,){11}P also does the job, but if you add support for escaped or quoted commas within fields using alternation, (?> becomes the easier choice.)
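The difference can be checked with a short script. This is only a sketch: the two sample records and the hash names are invented for illustration, and the three patterns are the ones quoted above.

```perl
use strict;
use warnings;

# Twelfth field is "x"; only the thirteenth field starts with P.
my $late  = join ',', 'a' .. 'k', 'x', 'Pz';
# Twelfth field starts with P.
my $exact = join ',', 'a' .. 'k', 'Pz';

my %patterns = (
    backtracking => qr/^(.*?,){11}P/,
    atomic       => qr/^(?>.*?,){11}P/,
    negated      => qr/^([^,]*,){11}P/,
);

my %result;
for my $name (sort keys %patterns) {
    $result{$name} = {
        late  => ($late  =~ $patterns{$name} ? 1 : 0),
        exact => ($exact =~ $patterns{$name} ? 1 : 0),
    };
    print "${name}: late=$result{$name}{late} exact=$result{$name}{exact}\n";
}
```

Only the backtracking version accepts the record whose twelfth field is "x": its eleventh group stretches across an extra comma to reach the later P. The atomic and negated-class versions reject that record and still accept the one whose twelfth field really starts with P.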

Hmmm... once I posted the question, looking at the "Related" column led me to this which has some pretty good ideas.... http://www.regular-expressions.info/catastrophic.html
