Using stored variables as regex patterns
Is there a way for Python to use values stored in variables as patterns in a regex?
Suppose I have the following variables:
begin_tag = '<%marker>'
end_tag = '<%marker/>'
doc = '<html> something here <%marker> and here and here <%marker/> and more here <html>'
How do you extract the text between begin_tag and end_tag?
The tags are determined after parsing another file, so they're not fixed.
Don't use a regex at all. Parse HTML intelligently!
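from bs4 import BeautifulSoup  # BeautifulSoup 4; the old "BeautifulSoup" module is Python 2 only
marker = 'mytag'
doc = '<html>some stuff <mytag> different stuff </mytag> other things </html>'
soup = BeautifulSoup(doc, 'html.parser')
# find() returns the first <mytag> element; decode_contents() gives its inner markup as a string
print(soup.find(marker).decode_contents())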
Regular expression patterns are ordinary strings, so you can build them any way you like: concatenation (the + operator), interpolation (the % operator), and so on. Just concatenate the variables you want to match with the rest of the regex:
begin_tag + ".*?" + end_tag
The only pitfall is that your variables may contain characters that the regular expression engine treats as special. Make sure they are escaped properly in that case; the re.escape() function does this.
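Putting the two points together, here is a minimal sketch that uses the variables from the question. The helper name extract_between is made up for illustration, and the capture group and re.DOTALL flag are my own assumptions about what "the text between" should mean (the first match only, possibly spanning newlines):

```python
import re

def extract_between(doc, begin_tag, end_tag):
    """Hypothetical helper: return the text between the first begin_tag
    and the next end_tag, or None if the pair is not found."""
    # re.escape() makes the tags match literally, even if they contain
    # regex metacharacters such as '[', '.', or '*'.
    pattern = re.compile(
        re.escape(begin_tag) + r'(.*?)' + re.escape(end_tag),
        re.DOTALL,  # assumption: the marked text may span several lines
    )
    match = pattern.search(doc)
    return match.group(1) if match else None

begin_tag = '<%marker>'
end_tag = '<%marker/>'
doc = '<html> something here <%marker> and here and here <%marker/> and more here <html>'

between = extract_between(doc, begin_tag, end_tag)
print(repr(between))
```

The non-greedy .*? stops at the first end_tag after begin_tag. If you need every marked region rather than just the first, pattern.findall(doc) on the same compiled pattern should return them all.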
The usual caveat ("don't parse HTML with regular expressions") applies.