Regular expression to parse an option string in Python

I can't seem to create the correct regular expression to extract the correct tokens from my string. Padding the beginning of the string with a space generates the correct output, but seems less than optimal:

>>> import re
>>> s = '-edge_0triggered a-b | -level_Sensitive c-d | a-b-c'
>>> re.findall(r'\W(-[\w_]+)',' '+s)
['-edge_0triggered', '-level_Sensitive'] # correct output

Here are some of the regular expressions I've tried. Does anyone have a regex suggestion that doesn't involve changing the original string and still generates the correct output?

>>> re.findall(r'(-[\w_]+)',s)
['-edge_0triggered', '-b', '-level_Sensitive', '-d', '-b', '-c']
>>> re.findall(r'\W(-[\w_]+)',s)
['-level_Sensitive']


r'(?:^|\W)(-\w+)'

\w already includes the underscore.
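As a quick check of both points, here is a minimal sketch that reuses the string from the question. The variable name `options` is only illustrative.

```python
import re

s = '-edge_0triggered a-b | -level_Sensitive c-d | a-b-c'

# \w matches letters, digits, and the underscore, so [\w_] is redundant.
underscore_is_word_char = re.fullmatch(r'\w', '_') is not None
print(underscore_is_word_char)  # True

# Either the start of the string or a non-word character must precede the hyphen.
options = re.findall(r'(?:^|\W)(-\w+)', s)
print(options)  # ['-edge_0triggered', '-level_Sensitive']
```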


Change the leading part of the pattern to accept either the start-of-string anchor or a non-word character, instead of only a non-word character:

>>> re.findall(r'(?:^|\W)(-[\w_]+)', s)
['-edge_0triggered', '-level_Sensitive']

The ?: at the beginning of the group makes it non-capturing: the regex engine still uses it for matching, but it is left out of the results, so findall returns only the text captured by (-[\w_]+).
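To see why the non-capturing group matters, the sketch below compares the same pattern with and without ?:. The names `with_capture` and `without_capture` are just for illustration.

```python
import re

s = '-edge_0triggered a-b | -level_Sensitive c-d | a-b-c'

# Two capturing groups: findall returns one tuple per match,
# including whatever the first group matched (empty at the start of the string).
with_capture = re.findall(r'(^|\W)(-\w+)', s)
print(with_capture)  # [('', '-edge_0triggered'), (' ', '-level_Sensitive')]

# One capturing group: findall returns only that group's text.
without_capture = re.findall(r'(?:^|\W)(-\w+)', s)
print(without_capture)  # ['-edge_0triggered', '-level_Sensitive']
```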


You could use a negative-lookbehind:

re.findall(r'(?<!\w)(-\w+)', s)

The (?<!\w) part means "match only if not preceded by a word character". A lookbehind is zero-width, so it checks the preceding character without consuming it.
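Because the lookbehind consumes nothing, the capturing group can be dropped as well. This is a small sketch of that variation on the string from the question, not the only way to write it; `lookbehind_options` and `starts` are illustrative names.

```python
import re

s = '-edge_0triggered a-b | -level_Sensitive c-d | a-b-c'

# The whole match is the option itself, so no capturing group is needed.
lookbehind_options = re.findall(r'(?<!\w)-\w+', s)
print(lookbehind_options)  # ['-edge_0triggered', '-level_Sensitive']

# Each match starts exactly at the option's leading hyphen.
starts = [m.start() for m in re.finditer(r'(?<!\w)-\w+', s)]
print(starts == [s.index('-edge_0triggered'), s.index('-level_Sensitive')])  # True
```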
