Regex issue using ICU regex/regexkitlite

I'm starting a new question because my previous one solved a different issue with this regex.

Here's my regex:

(?i)\\d{1,4}(?<!v(?:ol)?\\.?\\s?)(?![^\\(]*\\))

Regex split up for clarity:

(?i) - case insensitive

\\d{1,4} - a number with 1-4 digits

(?<!v(?:ol)?\\.?\\s?) - the number cannot be preceded by 'v', 'v.', 'vol', or 'vol.', each optionally followed by a single whitespace character.

(?![^\\(]*\\)) - Number cannot be inside parentheses.

It all works except for the 'vol.' part:

@"Words words 342 words (2342) (words 2 words) (words).ext" result 342 - correct.

@"Words - words words (2010) (words 2 words) (words).ext" result nil - correct.

@"words words v34 35.ext" result 34 - incorrect.

@"Words vol.342 343 (1234) (3 words) (desc).ext" result 342 - incorrect.

What am I doing wrong with my 'vol.' section?


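You need to put the lookbehind before the number. Placed after \\d{1,4}, it inspects the text that ends at the number's last digit, so it can never see a 'v' and never rejects anything. You also need to disallow digits inside the lookbehind; otherwise the regex simply skips the 3 in v.34 and matches the 4. Try: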

(?i)(?<!v(?:ol)?\\.?\\s*\\d*)\\d{1,4}(?![^(]*\\))

This assumes (edit: wrongly, as it turns out) that RegexKitLite supports unbounded repetition inside lookbehind, which few regex flavors do.

A look at the docs shows that it does support bounded (but variable-length) repetition inside lookbehind. So, as long as you accept that the following only works when there is at most one space between 'vol.' and the number, you could try:

(?i)(?<!v(?:ol)?\\.?\\s?)(?<!\\d)\\d{1,4}(?![^(]*\\))
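To see why the position of the lookbehind matters, here is a rough Python sketch run against the four file names from the question. It is an approximation, not the ICU/RegexKitLite syntax: Python's `re` module only accepts fixed-width lookbehind, so the optional parts of `v(?:ol)?\.?\s?` are expanded into one lookbehind per concrete prefix. The names `PREFIXES`, `misplaced`, `fixed`, and `first_number` are made up for this illustration, and the single backslashes are because Python raw strings do not need the extra escaping that the Objective-C string literals above do.

```python
import re

# Python's `re` accepts only fixed-width lookbehind, so the optional parts of
# v(?:ol)?\.?\s? are expanded into one negative lookbehind per concrete prefix:
# v, v\s, v\., v\.\s, vol, vol\s, vol\., vol\.\s
PREFIXES = [word + dot + space
            for word in ("v", "vol")
            for dot in ("", r"\.")
            for space in ("", r"\s")]
NOT_AFTER_VOL = "".join(f"(?<!{prefix})" for prefix in PREFIXES)
NOT_IN_PARENS = r"(?![^(]*\))"

# Lookbehind placed after the digits, as in the question. At that point the
# text just before the cursor ends in a digit, so the check never rejects.
misplaced = re.compile(r"\d{1,4}" + NOT_AFTER_VOL + NOT_IN_PARENS, re.IGNORECASE)

# Lookbehind placed before the digits, plus (?<!\d) so that a match cannot
# start in the middle of a number that was already rejected.
fixed = re.compile(NOT_AFTER_VOL + r"(?<!\d)\d{1,4}" + NOT_IN_PARENS, re.IGNORECASE)


def first_number(pattern, text):
    """Return the first match of `pattern` in `text`, or None if there is none."""
    match = pattern.search(text)
    return match.group(0) if match else None


SAMPLES = [
    "Words words 342 words (2342) (words 2 words) (words).ext",
    "Words - words words (2010) (words 2 words) (words).ext",
    "words words v34 35.ext",
    "Words vol.342 343 (1234) (3 words) (desc).ext",
]

for name in SAMPLES:
    print(first_number(misplaced, name), first_number(fixed, name), name)
```

With the misplaced lookbehind the last two names give 34 and 342, as reported in the question; with the lookbehind in front they give 35 and 343. The one-space limitation still applies: for a name such as "words vol.  12 13.ext" (two spaces after the dot), `fixed` returns 12.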