Regex help using repeated groups

I'm trying to match rc-update -s output in Python.

m = re.match(r"^\s*(\w+)\s*\|{\s*(\w+)\s*}*$", " network | level1 level2 leveln ")

but m is always None.

The hard part for me is getting the regex to match the n levels. I thought that using {}* would match the n levels, but as soon as I add the {}, nothing matches.

Thanks.


The curly braces ("{}") do not do what you think they do, at least in this example.

You seem to want a non-capturing group. With Python's re, the syntax for this is (?:...), so the repeated part of your example becomes (?:\s*(\w+)\s*).

With this change to your example, I get:

>>> m = re.match(r"^\s*(\w+)\s*\|(?:\s*(\w+)\s*)*$", " network | level1 level2 leveln ")
>>> m
<_sre.SRE_Match object at 0x00F217B8>
>>> m.groups()
('network', 'leveln')

Note that the result contains only the last match for the repeated group. To get all of the matches, capture the whole repeated section in a single group, and then split that string to find each of the matches. For example:

>>> m = re.match(r"^\s*(\w+)\s*\|((?:\s*\w+\s*)*)$", " network | level1 level2 leveln ")
>>> m.groups()
('network', ' level1 level2 leveln ')
>>> m.groups()[1].strip().split()
['level1', 'level2', 'leveln']
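
If you want this wrapped up for reuse, something along these lines might do. It is only a sketch: the names LINE_RE and parse_line are made up for illustration, and it uses re.findall on the captured tail instead of strip().split(), which should give the same list here.

```python
import re

# Same pattern as in the example above; LINE_RE and parse_line are
# hypothetical names chosen for this sketch.
LINE_RE = re.compile(r"^\s*(\w+)\s*\|((?:\s*\w+\s*)*)$")

def parse_line(line):
    """Return (service, [levels]), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    service, tail = m.groups()
    # findall returns every non-overlapping run of word characters in the tail
    return service, re.findall(r"\w+", tail)

print(parse_line(" network | level1 level2 leveln "))
```

Returning None for a non-matching line keeps the caller in charge of deciding whether to skip or report it.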

On a side note, this looks like something that would be much simpler to parse without regexps. As you can see, regexps have a lot of gotchas and become confusing very quickly.
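
For what it's worth, a regex-free version could look roughly like this. It assumes every line has the shape "name | level level ..." as in your sample (real rc-update output may contain other lines), and parse_rc_line is just a name I picked.

```python
def parse_rc_line(line):
    """Split 'service | level1 level2' into (service, [levels])."""
    # partition splits on the first "|" and returns (before, sep, after)
    service, sep, levels = line.partition("|")
    if not sep:
        return None  # no "|" found, so this is not a service line
    # split() with no arguments splits on any run of whitespace
    return service.strip(), levels.split()

print(parse_rc_line(" network | level1 level2 leveln "))
```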


The {} are odd here. They are not metacharacters when used this way. What is their purpose? At the moment they try to match a literal {, so the match fails.

Replace them with normal parentheses and it will work.
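
A quick way to see this for yourself (just a sketch; the variable names and the second test string with literal braces are mine, not from your output):

```python
import re

braces = r"^\s*(\w+)\s*\|{\s*(\w+)\s*}*$"   # pattern from the question
parens = r"^\s*(\w+)\s*\|(\s*(\w+)\s*)*$"   # same pattern with ( ) instead of { }
line = " network | level1 level2 leveln "

# The brace version fails on the real input...
print(re.match(braces, line) is None)

# ...but matches a made-up string that really contains "{" right after "|".
print(re.match(braces, " network |{ level1 }") is not None)

# With parentheses the original line matches. Group 3 is the inner (\w+),
# which keeps only the last repetition.
m = re.match(parens, line)
print(m.group(1), m.group(3))
```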
