Replace section of text with only knowing the beginning and last word using Python
In Python, it possible to cut out a section of text in a document when you only know the beginning and end words?
For example, using the bill of rights as the sample document, search for "Amendment 3" and remove all the text until you hit "Amendment 4" without actually knowing or caring wh开发者_StackOverflow中文版at text exists between the two end points.
The reason I'm asking is I would like to use this Python script to modify my other Python programs when I upload them to the client's computer -- removing sections of code that exists between a comment that says "#chop-begin" and "#chop-end". I do not want the client to have access to all of the functions without paying for the better version of the code.
You can use Python's re
module.
I wrote this example script for removing the sections of code in file:
import re
# Create regular expression pattern
chop = re.compile('#chop-begin.*?#chop-end', re.DOTALL)
# Open file
f = open('data', 'r')
data = f.read()
f.close()
# Chop text between #chop-begin and #chop-end
data_chopped = chop.sub('', data)
# Save result
f = open('data', 'w')
f.write(data_chopped)
f.close()
With data.txt
do_something_public()
#chop-begin abcd
get_rid_of_me() #chop-end
#chop-beginner this should stay!
#chop-begin
do_something_private()
#chop-end The rest of this comment should go too!
but_you_need_me() #chop-begin
last_to_go()
#chop-end
the following code
import re
class Chopper(object):
def __init__(self, start='\\s*#ch'+'op-begin\\b', end='#ch'+'op-end\\b.*?$'):
super(Chopper,self).__init__()
self.re = re.compile('{0}.*?{1}'.format(start,end), flags=re.DOTALL+re.MULTILINE)
def chop(self, s):
return self.re.sub('', s)
def chopFile(self, infname, outfname=None):
if outfname is None:
outfname = infname
with open(infname) as inf:
data = inf.read()
with open(outfname, 'w') as outf:
outf.write(self.chop(data))
ch = Chopper()
ch.chopFile('data.txt')
results in data.txt
do_something_public()
#chop-beginner this should stay!
but_you_need_me()
Use regular expressions:
import re
string = re.sub('#chop-begin.*?#chop-end', '', string, flags=re.DOTALL)
.*?
will match all between.
精彩评论