regexes: How to access multiple matches of a group? [duplicate]

This question already has answers here: RegEx with multiple groups? (4 answers). Closed 5 years ago.

I am putting together a fairly complex regular expression. One part of the expression matches strings such as '+a', '-57', etc.: a + or a - followed by one or more letters or digits. I want to match zero or more strings of this form.

This is the expression I came up with:

([\+-][a-zA-Z0-9]+)*

If I were to search the string '-56+a' using this pattern I would expect to get two matches:

+a and -56

However, I only get the last match returned:

>>> import re
>>> m = re.match(r"([\+-][a-zA-Z0-9]+)*", '-56+a')
>>> m.groups()
('+a',)

Looking at the Python docs, I see that:

If a group matches multiple times, only the last match is accessible:

>>> m = re.match(r"(..)+", "a1b2c3")  # Matches 3 times.
>>> m.group(1)                        # Returns only the last match.
'c3'

So, my question is: how do you access multiple group matches?


Drop the * from your regex (so it matches exactly one instance of your pattern). Then use either re.findall(...) or re.finditer(...) to return all matches.
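
As a rough sketch of that suggestion, something like the following should work with the standard re module. The names TOKEN, signed_tokens, signed_token_spans and is_token_sequence are made up for illustration and are not part of the question or of any library.

```python
import re

# One signed token: a '+' or '-' followed by one or more letters or digits.
# No trailing '*' here, so each match is exactly one token.
TOKEN = re.compile(r"[+-][a-zA-Z0-9]+")


def signed_tokens(text):
    """Return every signed token found in text, in order of appearance."""
    return TOKEN.findall(text)


def signed_token_spans(text):
    """Return (token, start, end) triples, using finditer for positions."""
    return [(m.group(0), m.start(), m.end()) for m in TOKEN.finditer(text)]


def is_token_sequence(text):
    """Check that text consists of zero or more signed tokens and nothing else."""
    return re.fullmatch(r"(?:[+-][a-zA-Z0-9]+)*", text) is not None


print(signed_tokens("-56+a"))       # ['-56', '+a']
print(signed_token_spans("-56+a"))  # [('-56', 0, 3), ('+a', 3, 5)]
```

Note that findall and finditer silently skip any text that does not match, so if the whole string has to be made of such tokens, a separate check like is_token_sequence (an assumption about what the larger expression needs) can be run first.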

Update:

It sounds like you're essentially building a recursive descent parser. For relatively simple parsing tasks, it is quite common and entirely reasonable to do that by hand. If you're interested in a library solution (in case your parsing task may become more complicated later on, for example), have a look at pyparsing.


The third-party regex module (available on PyPI, not part of the standard library) handles this by adding a .captures method:

>>> import regex
>>> m = regex.match(r"(..)+", "a1b2c3")
>>> m.captures(1)
['a1', 'b2', 'c3']