
Python: Regex to find but not include an alphanumeric

Is there a regular expression to find, for example, ">ab" but not include ">" in the result?

I want to replace some strings using re.sub, and I want to find strings starting with ">" without removing the ">".


You want a positive lookbehind assertion. See the docs.

r'(?<=>)ab'

The lookbehind pattern must have a fixed width; it cannot match a variable number of characters. Basically, do

r'(?<=stringiwanttobebeforethematch)stringiwanttomatch'

So, an example:

import re

# replace 'ab' with 'e' if it has '>' before it

# here we've got '>ab', so we'll get '>ecd'
print(re.sub(r'(?<=>)ab', 'e', '>abcd'))

# here we've got 'ab' but no '>', so we'll get 'abcd'
print(re.sub(r'(?<=>)ab', 'e', 'abcd'))
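
The fixed-width restriction mentioned above can be seen in a short sketch. It assumes the standard library re module, and the '>+' prefix and the '>>>abcd' input are made up for illustration. When the prefix can vary in length, capturing it and writing it back is one possible workaround:

```python
import re

# Fixed-width lookbehind: compiles without complaint.
fixed = re.compile(r'(?<=>)ab')

# Variable-width lookbehind: the standard re module rejects it at compile time.
try:
    re.compile(r'(?<=>+)ab')
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# Workaround for a prefix of varying length (illustrative input):
# capture the prefix in a group and put it back with \1.
result = re.sub(r'(>+)ab', r'\1e', '>>>abcd')

print(variable_width_ok)
print(result)
```

The capture-group form does consume the prefix during matching, so it only behaves like a lookbehind because the replacement writes the prefix back.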


You can use a back reference in sub:

import re
test = """
>word
>word2
don't replace
"""
print(re.sub(r'(>).*', r'\1replace!', test))

Outputs:

>replace!
>replace!
don't replace

I believe this accomplishes what you actually want when you say "I want to replace some strings using re.sub, and I want to find strings starting with '>' without remove the '>'."
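
As a rough check, the back reference and a lookbehind should give the same result on this input. The sketch below reuses the test string from above; the variable names are only illustrative:

```python
import re

test = """
>word
>word2
don't replace
"""

# Back reference: match '>' plus the rest of the line, then write '>' back.
with_backref = re.sub(r'(>).*', r'\1replace!', test)

# Lookbehind: match only the rest of the line, so '>' is never consumed.
with_lookbehind = re.sub(r'(?<=>).*', 'replace!', test)

print(with_backref == with_lookbehind)
```

Both forms leave lines without a leading '>' untouched, because '.' does not match a newline by default.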


If you want to avoid using the re module, you can also use the startswith() string method.

>>> foo = ['>12', '>54', '34']
>>> for line in foo:
...     if line.startswith('>'):
...         line = line.lstrip('>')
...     print(line)
...
12
54
34
>>> 
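
The session above removes the ">", whereas the question asks to keep it. A minimal sketch of the same startswith() idea that keeps the marker could look like this; replace_after_marker is a hypothetical helper, not a library function, and 'replace!' is a placeholder:

```python
def replace_after_marker(line, replacement, marker='>'):
    """Hypothetical helper: swap everything after a leading marker, keeping the marker."""
    if line.startswith(marker):
        return marker + replacement
    # Lines without the marker are returned unchanged.
    return line

lines = ['>12', '>54', '34']
result = [replace_after_marker(line, 'replace!') for line in lines]
print(result)
```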