RegexpError: Stack overflow in regexp matcher

I have a small problem with a simple tokenizer regex:

def test_tokenizer_regex_limit
   string = '<p>a</p>' * 400
   tokens = string.scan(/(<\s*tag:.*?\/?>)|((?:[^<]|\<(?!\s*tag:.*?\/?>))+)/)
end

Basically it runs through the text and gets pairs of [ matched_tag , other_text ]. Here's an example: http://rubular.com/r/f88JBjfzFh

Works fine for smaller sets. If you run it under Ruby 1.8.7 it will blow up. 1.9.2 works fine.

Any ideas how to simplify / improve this? My regex-fu is weak.


This is a bit more simplified but not much:

(<[^<]*:[^<]*>)|((?:[^<]|<[^:]*>)+)

(<.*?>|[^<>]+)
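
A rough sketch of how a flat pattern like the one above could be used: split the input into chunks with a regex that has no alternation inside a repeated group, then decide in plain Ruby which chunks are `tag:` tags and glue everything else back into text. The names `PIECE`, `TAG_START`, and `tokenize` are made up for illustration. The sketch assumes a tag never contains a nested `<` or `>` (the original pattern would tolerate that), and it has not been verified on 1.8.7.

```ruby
# Hypothetical names, not from the original post.
# PIECE matches a tag-like chunk, a run of plain text, or a stray bracket.
PIECE     = /<[^<>]*>|[^<>]+|[<>]/
# TAG_START checks whether a chunk opens with "<tag:" (optional whitespace allowed).
TAG_START = /\A<\s*tag:/

# Returns pairs shaped like the original scan result:
# [matched_tag, nil] for a tag, [nil, other_text] for everything else.
def tokenize(string)
  tokens = []
  text = String.new
  string.scan(PIECE) do |piece|
    if piece =~ TAG_START
      # Flush the text collected so far before recording the tag.
      tokens << [nil, text] unless text.empty?
      text = String.new
      tokens << [piece, nil]
    else
      # Ordinary markup such as <p> counts as text, as in the original regex.
      text << piece
    end
  end
  tokens << [nil, text] unless text.empty?
  tokens
end
```

Called as `tokenize('<p>a</p>' * 400)`, this yields a single `[nil, text]` pair holding the whole string, since no chunk starts with `<tag:`.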
