Python: Regex to find but not include an alphanumeric
Is there a regular expression to find, for example, ">ab" but not include ">" in the result?
I want to replace some strings using re.sub, and I want to find strings starting with ">" without removing the ">".
You want a positive lookbehind assertion. See the docs.
r'(?<=>)ab'
In Python's re module the lookbehind pattern must be fixed-width; it can't match a variable number of characters. Basically, do
r'(?<=stringiwanttobebeforethematch)stringiwanttomatch'
So, an example:
import re
# replace 'ab' with 'e' if it has '>' before it
# here we've got '>ab' so we'll get '>ecd'
print(re.sub(r'(?<=>)ab', 'e', '>abcd'))
# here we've got 'ab' but no '>' so we'll get 'abcd'
print(re.sub(r'(?<=>)ab', 'e', 'abcd'))
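To illustrate the fixed-width restriction mentioned above, here is a minimal sketch for Python 3. The pattern `(?<=>+)ab` and the `>>>abcd` input are made up for the demonstration; the workaround with a capture group is one possible approach, not the only one.

```python
import re

# Fixed-width lookbehind: accepted by the re module.
fixed = re.compile(r'(?<=>)ab')
print(fixed.sub('e', '>abcd'))   # '>' precedes 'ab', so 'ab' is replaced and '>' is kept
print(fixed.sub('e', 'abcd'))    # no '>' before 'ab', so the string is unchanged

# Variable-width lookbehind ('>+' can match one or more characters):
# the re module rejects it when the pattern is compiled.
try:
    re.compile(r'(?<=>+)ab')
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)

# Possible workaround: capture the variable-length prefix and put it back.
kept = re.sub(r'(>+)ab', r'\1e', '>>>abcd')
print(kept)
```

When the prefix can vary in length, capturing it and re-inserting it with `\1` gives the same "match but keep" effect without a lookbehind.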
You can use a back reference in sub:
import re
test = """
>word
>word2
don't replace
"""
print(re.sub('(>).*', r'\1replace!', test))
Outputs:
>replace!
>replace!
don't replace
I believe this accomplishes what you actually want when you say "I want to replace some strings using re.sub, and I want to find strings starting with '>' without removing the '>'."
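One thing to watch with the back-reference pattern: it is not anchored, so it also fires on a ">" in the middle of a line. A small sketch for Python 3, assuming you only want lines that begin with ">" (the extra "a > b" line is added here purely for illustration):

```python
import re

test = """
>word
>word2
don't replace
a > b
"""

# Unanchored: the '>' inside 'a > b' matches too, so that line is rewritten.
unanchored = re.sub(r'(>).*', r'\1replace!', test)
print(unanchored)

# Anchored with ^ under re.MULTILINE: only lines that start with '>' change.
anchored = re.sub(r'^(>).*', r'\1replace!', test, flags=re.MULTILINE)
print(anchored)
```

With `re.MULTILINE`, `^` matches at the start of every line rather than only at the start of the string, which is what restricts the replacement to lines beginning with ">".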
If you want to avoid using the re module, you can also use the startswith() string method.
>>> foo = [ '>12', '>54', '34' ]
>>> for line in foo:
...     if line.startswith('>'):
...         line = line.lstrip('>')  # strip('>') would also remove a trailing '>'
...     print(line)
...
12
54
34
>>>
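If the goal is to keep the ">" and replace only what follows it, as the question asks, the same startswith() idea can be sketched like this. The helper name `replace_after_prefix` and its default arguments are hypothetical, chosen just for this example.

```python
def replace_after_prefix(line, prefix='>', replacement='replace!'):
    """Return prefix + replacement if line starts with prefix, else line unchanged."""
    if line.startswith(prefix):
        return prefix + replacement
    return line

lines = ['>12', '>54', '34']
result = [replace_after_prefix(line) for line in lines]
print(result)
```

This avoids regular expressions entirely, at the cost of only handling a literal prefix at the very start of each string.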