How to substitute into a regular expression group in Python
>>> s = 'foo: "apples", bar: "oranges"'
>>> pattern = 'foo: "(.*)"'
I want to be able to substitute into the group like this:
>>> re.sub(pattern, 'pears', s, group=1)
'foo: "pears", bar: "oranges"'
Is there a nice way to do this?
Something like this works for me:
import re

rx = re.compile(r'(foo: ")(.*?)(".*)')
s_new = rx.sub(r'\g<1>pears\g<3>', s)
print(s_new)  # foo: "pears", bar: "oranges"
Notice the ? in the regex: it makes .* non-greedy, so group 2 ends at the first ". Also notice that the " characters are inside groups 1 and 3, because they must appear in the output.
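Instead of \g<1> (or \g<number> in general) you can use just \1, but remember to use "raw" strings. The \g<1> form is preferred because \1 can be ambiguous: \10, for example, is read as a reference to group 10, not as group 1 followed by the character 0 (see the examples in the Python documentation).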
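To make the ambiguity concrete, here is a small sketch. The replacement text '2 pears' is a made-up value chosen only because it starts with a digit; the pattern and input string are the ones used above.

```python
import re

s = 'foo: "apples", bar: "oranges"'
rx = re.compile(r'(foo: ")(.*?)(".*)')

# Hypothetical replacement text that starts with a digit.
new_value = '2 pears'

# \g<1> ends at the closing bracket, so the digit that follows is literal text.
safe = rx.sub(r'\g<1>' + new_value + r'\g<3>', s)
print(safe)

# \1 followed by "2" is read as a reference to group 12, which does not exist
# in this pattern, so the template is rejected.
try:
    rx.sub(r'\1' + new_value + r'\3', s)
    ambiguous_failed = False
except re.error as exc:
    ambiguous_failed = True
    print('error:', exc)
```

The first call produces the intended string, while the second one raises re.error instead of silently doing the wrong thing.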
re.sub(r'(?<=foo: ")[^"]+(?=")', 'pears', s)
The regex matches a sequence of chars that
- follows the string foo: ",
- doesn't contain double quotation marks, and
- is followed by ".
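(?<=...) and (?=...) are lookbehind and lookahead assertions. They check the text around the match without becoming part of it, so only the value itself gets replaced.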
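A quick way to see that the lookarounds are not part of the match is to inspect the match object. This is only an illustrative sketch using the string from the question:

```python
import re

s = 'foo: "apples", bar: "oranges"'

# The lookbehind and lookahead only assert what surrounds the value.
m = re.search(r'(?<=foo: ")[^"]+(?=")', s)

matched_text = m.group(0)  # just the value, without foo: " or the closing "
matched_span = m.span()    # start and end offsets of the value within s
print(matched_text, matched_span)
```

Because the match covers only the value, re.sub leaves the foo: " prefix and the closing quote untouched.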
This regex will fail if the value of foo contains escaped quotes. Use the following one to handle them too:
re.sub(r'(?<=foo: ")(\\"|[^"])+(?=")', 'pears', s)
Sample code
>>> import re
>>> s = 'foo: "apples \\\"and\\\" more apples", bar: "oranges"'
>>> print(s)
foo: "apples \"and\" more apples", bar: "oranges"
>>> print(re.sub(r'(?<=foo: ")(\\"|[^"])+(?=")', 'pears', s))
foo: "pears", bar: "oranges"
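If you really want something that behaves like the group=1 argument from the question, one possible approach is to pass a replacement function to re.sub and splice the new text into the span of the chosen group. The helper name sub_group and its signature are made up for this sketch; re.sub itself has no group parameter. The pattern also uses a non-greedy .*? rather than the .* from the question, otherwise the group would run to the last " in the string.

```python
import re

def sub_group(pattern, replacement, text, group=1):
    """Replace only the text captured by `group` in each match (hypothetical helper)."""
    def splice(match):
        whole = match.group(0)
        if match.group(group) is None:
            # The group did not take part in this match: leave it unchanged.
            return whole
        offset = match.start()
        start = match.start(group) - offset
        end = match.end(group) - offset
        # Keep everything around the group, swap in the replacement literally.
        return whole[:start] + replacement + whole[end:]
    return re.sub(pattern, splice, text)

s = 'foo: "apples", bar: "oranges"'
result = sub_group(r'foo: "(.*?)"', 'pears', s)
print(result)
```

Note that the replacement is inserted as literal text here, so backreferences such as \g<1> inside it are not expanded.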