Problem with quantifiers and look-behind
### Ruby 1.8.7 ###
require 'rubygems'
require 'oniguruma' # for look-behind
Oniguruma::ORegexp.new('h(?=\w*)')
# => /h(?=\w*)/
Oniguruma::ORegexp.new('(?<=\w*)o')
# => ArgumentError: Oniguruma Error: invalid pattern in look-behind
Oniguruma::ORegexp.new('(?<=\w)o')
# => /(?<=\w)o/
### Ruby 1.9.2 rc-2 ###
"hello".match(/h(?=\w*)/)
# => #<MatchData "h">
"hello".match(/(?<=\w*)o/)
# => SyntaxError: (irb):3: invalid pattern in look-behind: /(?<=\w*)o/
"hello".match(/(?<=\w)o/)
# => #<MatchData "o">
Can't I use quantifiers with look-behind?
The issue is that Ruby doesn't support variable-length lookbehinds. Quantifiers aren't out per se, but they can't cause the length of the lookbehind to be nondeterministic.
Perl has traditionally had the same restriction, as have most other regex engines (.NET is a notable exception).
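To make "deterministic length" concrete, here is a minimal sketch, assuming Ruby 1.9 or later where the engine is built in. The helper name lookbehind_compiles? is made up for this illustration; it only reports whether a pattern source compiles. A fixed repetition count and top-level alternatives of different lengths are accepted, while the open-ended \w* from the question is the case that gets rejected.

```ruby
# Hypothetical helper (not part of Ruby): true if the pattern source
# compiles, false if the engine rejects it.
def lookbehind_compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# A fixed repetition count keeps the lookbehind's length known in advance.
fixed_count = '(?<=\w{2})o'

# Top-level alternatives may differ in length from one another.
alternatives = '(?<=l|el)o'

# An open-ended quantifier leaves the length unknown; this is the
# pattern that failed in the question.
open_ended = '(?<=\w*)o'

[fixed_count, alternatives, open_ended].each do |source|
  puts "#{source}: #{lookbehind_compiles?(source)}"
end
```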
Try using the straightforward match (\w*)\W*?o instead of the lookbehind.
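A rough sketch of that capture-group approach, assuming Ruby 1.9 or later. The method name split_prefix_and_target and the second capture group around the o are my additions for illustration, not part of the pattern above; the second group just makes it easy to pull out the part the lookbehind would have left as the match.

```ruby
# Capture the variable-length prefix and the target separately instead of
# asserting the prefix with a lookbehind.
# Hypothetical helper name; the (o) group is added for illustration.
def split_prefix_and_target(text)
  match = text.match(/(\w*)\W*?(o)/)
  return nil unless match

  {
    prefix: match[1],               # what the lookbehind would have asserted
    target: match[2],               # what the lookbehind version would have matched
    target_offset: match.begin(2)   # where the target starts in the string
  }
end

p split_prefix_and_target("hello")
```

For "hello" the prefix is "hell" and the target is the "o" at offset 4, so the caller gets the same information the lookbehind version was after, at the cost of reading a group instead of the whole match.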
I was banging my head against the same problem, and Borealid's answer helped explain the issue well.
However, that got me thinking. Maybe the quantifier does not need to be inside the lookbehind, but can be applied on the lookbehind itself?
"hello".match(/(?<=\w*)o/)
# => SyntaxError: (irb):3: invalid pattern in look-behind: /(?<=\w*)o/
"hello".match(/(?<=\w)*o/)
# => #<MatchData "o">
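So now we have a variable number of constant-length lookbehinds. Seems to bypass the issue for me. :) One caveat: * also allows zero repetitions, so this pattern compiles but no longer requires a word character before the o.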
For those who found this thread in 2022 with Ruby version >= 2.0, use \K:
$ ruby -e 'p ARGV.first.match(/h\K\w*/)' hello
#<MatchData "ello">
Quoted from https://ruby-doc.org/core-3.1.2/doc/regexp_rdoc.html
\K - Uses a positive lookbehind of the content preceding \K in the regexp.
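Applied to the pattern from the original question, a small sketch assuming Ruby 2.0 or later (the method name match_after_word_prefix is invented for this example). The variable-length \w* goes before \K, so it has to match but is dropped from the reported result.

```ruby
# \K discards everything matched so far from the reported match, so the
# variable-length prefix \w* is required but not returned.
# Hypothetical helper name, for illustration only.
def match_after_word_prefix(text)
  text.match(/\w*\Ko/)
end

m = match_after_word_prefix("hello")
puts m[0]          # the part after \K
puts m.pre_match   # everything before the reported match

# Substitution only touches the part after \K.
puts "hello".sub(/\w*\Ko/, "0")
```

This prints o, hell and hell0. Unlike a true lookbehind, the text before \K is still consumed by the match, so results can differ when matches would otherwise overlap (for example with scan or gsub).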