Problem with quantifiers and look-behind

### Ruby 1.8.7 ###

require 'rubygems'
require 'oniguruma' # for look-behind

Oniguruma::ORegexp.new('h(?=\w*)')
# => /h(?=\w*)/

Oniguruma::ORegexp.new('(?<=\w*)o')
# => ArgumentError: Oniguruma Error: invalid pattern in look-behind

Oniguruma::ORegexp.new('(?<=\w)o')
# => /(?<=\w)o/


### Ruby 1.9.2 rc-2 ###

"hello".match(/h(?=\w*)/)
# => #<MatchData "h">

"hello".match(/(?<=\w*)o/)
# => SyntaxError: (irb):3: invalid pattern in look-behind: /(?<=\w*)o/

"hello".match(/(?<=\w)o/)
# => #<MatchData "o"> 

Can't I use quantifiers inside a look-behind?


The issue is that Ruby doesn't support variable-length lookbehinds. Quantifiers aren't out per se, but they can't cause the length of the lookbehind to be nondeterministic.
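
To make the fixed-length rule concrete, here is a minimal sketch. `compiles?` is a hypothetical helper introduced only for this illustration, and the sketch assumes a Ruby whose regex engine rejects variable-length lookbehinds, as in the versions shown in the question.

```ruby
# Hypothetical helper (not part of Ruby): reports whether a pattern source compiles.
def compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# A quantifier with an exact repeat count keeps the lookbehind length known.
puts compiles?('(?<=\w{3})o')

# An open-ended quantifier makes the length unknown, so compilation is rejected.
puts compiles?('(?<=\w*)o')

# The fixed-length form can then be used as usual.
puts "hello"[Regexp.new('(?<=\w{3})o')]
```

So a quantifier such as `{3}` is acceptable inside the lookbehind because it always consumes the same number of characters, while `*`, `+` and `?` are not.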

Perl has the same restriction, as does just about every major language featuring regexes.

Try using the straightforward match (\w*)\W*?o instead of the lookbehind.
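
A rough sketch of that capture-group approach, assuming you want the word characters that come before the `o`. `word_before_o` is a hypothetical helper name used only for illustration.

```ruby
# Hypothetical helper: returns the word characters captured before "o",
# or nil when the text contains no match at all.
def word_before_o(text)
  m = text.match(/(\w*)\W*?o/)
  m && m[1]
end

p word_before_o("hello")
p word_before_o("xyz")
```

Unlike a lookbehind, the whole match (`m[0]`) includes the preceding characters, so the part you care about has to be read from the capture group.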


I was banging my head against the same problem, and Borealid's answer helped explain the issue well.

However, that got me thinking. Maybe the quantifier does not need to be inside the lookbehind, but can be applied on the lookbehind itself?

"hello".match(/(?<=\w*)o/)
# => SyntaxError: (irb):3: invalid pattern in look-behind: /(?<=\w*)o/

"hello".match(/(?<=\w)*o/)
# => #<MatchData "o">

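So now we have a variable number of constant-length lookbehinds, and the pattern compiles. Be aware, though, that `*` also allows zero repetitions, so the lookbehind becomes optional and no longer constrains what precedes the o.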


For those who found this thread in 2022 with Ruby version >= 2.0, use \K:

$ ruby -e 'p ARGV.first.match(/h\K\w*/)' hello
#<MatchData "ello">

Quoted from https://ruby-doc.org/core-3.1.2/doc/regexp_rdoc.html

\K - Uses an positive lookbehind of the content preceding \K in the regexp.
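
As a small sketch of how \K behaves with a variable-length prefix (assuming Ruby >= 2.0), the hypothetical helper `after_keep` below returns the matched text together with the offset where the reported match starts.

```ruby
# Hypothetical helper: returns [matched text, start offset] or nil.
# Everything the pattern matched before \K is left out of the reported match.
def after_keep(text, pattern)
  m = text.match(pattern)
  m && [m[0], m.begin(0)]
end

p after_keep("hello", /h\K\w*/)  # prefix "h" is dropped from the result
p after_keep("hello", /\w*\Ko/)  # variable-length prefix, rejected inside (?<=...)
```

One difference from a real lookbehind: the text before \K is still consumed by the match, so it cannot be matched again by a later, overlapping match (for example when using `scan`).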
