Ruby 1.9 Regex Lookbehind Assertion & Anchors
Ruby 1.9's regex engine supports lookbehind assertions, but I run into trouble when the lookbehind pattern contains anchors. The same anchors work fine inside a lookahead assertion.
"well substr开发者_C百科ing! "[/(?<=^|\A|\s|\b)substring!(?=$|\Z|\s|\b)/] #=> RegexpError: invalid pattern in look-behind: /(?<=^|\A|\s|\b)substring(?=$|\Z|\s|\b)/
Does anybody know how to make anchors work in lookbehind assertions the way they do in lookahead?
Is there a special escape sequence or grouping that is required for lookbehind?
I have tested this behavior in 1.9.1-p243, p376 and 1.9.2-preview3 just in case it was patched.
Looks like you're right: \b works as expected in a lookahead, but in a lookbehind it's treated as a syntax error. It doesn't really matter in this case: if (?<=^|\A|\s|\b) would have yielded the desired result, \b is all you needed anyway. The character following the assertion has to be s (a word character), so \b means either (1) the previous character is not a word character, or (2) there is no previous character. That being the case, ^, \A and \s are all redundant.
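To illustrate the redundancy argument, here is a minimal sketch that uses a bare \b in front of the word instead of a lookbehind. The sample strings other than the one from the question are made up for illustration.

```ruby
# \b placed directly in the pattern (no lookbehind needed): because the next
# character is the word character "s", \b succeeds at the start of the string
# and after any non-word character such as a space.
boundary_pattern = /\bsubstring/

at_start    = "substring here"[boundary_pattern]    # no previous character
after_space = "well substring! "[boundary_pattern]  # previous character is a space
inside_word = "wellsubstring"[boundary_pattern]     # previous character is "l", a word character

p at_start     # "substring"
p after_space  # "substring"
p inside_word  # nil
```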
However, if the string starts with ! it's a different story. ^ and \A still match the beginning of the string, before the !, but \b matches after it. If you want to match !substring! as a complete string you have to use /\A!substring!\Z/, but if you only want to match the whole word substring you have to use /\bsubstring\b/.
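A short sketch of the difference, assuming the input is exactly "!substring!" (the variable names are only for illustration):

```ruby
bang_text = "!substring!"

# \A and \Z anchor to the ends of the string, so the ! characters are included.
whole_string = bang_text[/\A!substring!\Z/]

# \b sits between "!" and "s", so only the word itself is matched.
whole_word = bang_text[/\bsubstring\b/]

# There is no word boundary before a leading "!", so this finds nothing.
bang_after_boundary = bang_text[/\b!substring/]

# The first word boundary in the string is at index 1, after the "!".
first_boundary = bang_text.index(/\b/)

p whole_string         # "!substring!"
p whole_word           # "substring"
p bang_after_boundary  # nil
p first_boundary       # 1
```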
As for [^\B], that just matches any character except B. Like \b, \B is a zero-width assertion, and a character class has to match exactly one character. Some regex flavors would throw an exception for the invalid escape sequence \B, but Ruby (or Oniguruma, more likely) lets it slide.
It looks as if the lookbehind is interpreted like a range [] rather than a group (), which is how lookahead assertions are treated. That possibly means \b is read as an invalid backspace character and not as a word boundary.
"well substring! "[/(?<=^|\A|\s|[^\B])substring!(?=$|\Z|\s|\b)/] #=> substring!
"well substring! "[/(?<=^|\A|\s|[^\B])substring(?=$|\Z|\s|\b)/] #=> substring
"well !substring! "[/(?<=^|\A|\s|[^\B])substring(?=$|\Z|\s|\b)/] #=> substring
"well !substring! "[/(?<=^|\A|\s|[^\B])!substring(?=$|\Z|\s|\b)/] #=> !substring
When all else fails... use a double negative!
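Another way to get a double negative, not taken from the posts here, is a negative lookaround on \S: "not preceded by a non-whitespace character" covers both the start of the string and a preceding space without putting any anchor inside the lookbehind. A minimal sketch, with made-up test strings:

```ruby
# (?<!\S) : no non-whitespace character immediately before the match
# (?!\S)  : no non-whitespace character immediately after the match
# Both are fixed-width, so no anchors or alternations are needed inside them.
token_pattern = /(?<!\S)!substring!(?!\S)/

padded   = "well !substring! "[token_pattern]  # surrounded by spaces
alone    = "!substring!"[token_pattern]        # whole string
attached = "well!substring!"[token_pattern]    # glued to the previous word

p padded    # "!substring!"
p alone     # "!substring!"
p attached  # nil
```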
Yep, looks like Ruby 1.9.2 doesn't support \b inside a lookbehind.
ruby-1.9.2-p180 :034 > "See Jeffs book and it seems fine!".gsub(/(?=s\b)(?<=\bJeff)/,"'")
SyntaxError: (irb):34: invalid pattern in look-behind: /(?=s\b)(?<=\bJeff)/
from /home/pratikk/.rvm/rubies/ruby-1.9.2-p136/bin/irb:16:in `<main>'
ruby-1.9.2-p180 :033 > "See Jeffs book and it seems fine!".gsub(/(?=s\b)(?<=Jeff)/,"'")
=> "See Jeff's book and it seems fine!"
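If the word boundary before Jeff still matters, one possible workaround (a sketch, not something tested on the 1.9.2 builds above) is to keep \b outside any lookbehind and carry the name through with a capture group:

```ruby
sentence = "See Jeffs book and it seems fine!"

# \b stays in the main pattern; the name is captured and written back,
# followed by the apostrophe. The lookahead keeps the trailing "s" untouched.
possessive = sentence.gsub(/\b(Jeff)(?=s\b)/) { "#{$1}'" }

puts possessive  # See Jeff's book and it seems fine!
```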