Nested quantifiers in boost::regex

Is \d++ a valid regular expression in programming languages that don't support possessive quantifiers? Is it equivalent to (\d+)+?

Testing it in Python raises sre_constants.error: multiple repeat. In C#, it throws a runtime exception: System.ArgumentException: parsing "\d++" - Nested quantifier +. boost::xpressive rejects it as well.

But \d++...+ (any number of stacked + quantifiers) is considered valid in boost::regex:

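#include <boost/regex.hpp>
#include <iostream>

int main() {
    const wchar_t* s = L"abc123";
    // Three stacked '+' quantifiers after \d
    boost::wregex e(L"\\d+++", boost::regex::normal);
    boost::wcmatch m;

    if (boost::regex_search(s, m, e)) {
        std::wcout << m[0] << std::endl;
    }
}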

The output is 123.


The above code throws an instance of boost::bad_expression with "Invalid preceding regular expression" for me. It's a Red Hat Linux system, compiled with gcc 3.4.6 and Boost 1_32.


Without possessive quantifiers, what would \d++ (or (\d+)+) mean?

Let's assume it were valid syntax, read as "one or more (one or more digits)". In that case, we could still reduce the expression to \d+: since \d+ can match a single digit, any string matched by (\d+)+ is also matched by (\d)+, which is simply one or more digits. Therefore, \d++ would be redundant.
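To make the reduction concrete, here is a minimal sketch comparing \d+ with the explicitly grouped (\d+)+. It uses std::regex from the C++ standard library rather than boost::regex, purely so it compiles without extra dependencies, and first_match is a hypothetical helper name introduced for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first substring of `text` matched by
// `pattern`, or an empty string when there is no match.
// Assumption: std::regex (default ECMAScript grammar) stands in for boost::regex.
std::string first_match(const std::string& pattern, const std::string& text) {
    std::regex re(pattern);
    std::smatch m;
    if (std::regex_search(text, m, re)) {
        return m[0].str();
    }
    return std::string();
}
```

Calling first_match("\\d+", "abc123") and first_match("(\\d+)+", "abc123") both return "123", which is what the reduction above predicts. The grouped form is still worth avoiding in practice, since nested quantifiers can backtrack heavily on long inputs that fail to match.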

I am not aware of any regular expression engine where \d++ is valid syntax, aside from engines that support possessive quantifiers.
