Replace a section of text knowing only the first and last words, using Python

In Python, is it possible to cut out a section of text in a document when you only know the beginning and end words?

For example, using the Bill of Rights as the sample document, search for "Amendment 3" and remove all the text until you hit "Amendment 4" without actually knowing or caring what text exists between the two end points.

The reason I'm asking is I would like to use this Python script to modify my other Python programs when I upload them to the client's computer -- removing the sections of code that sit between a comment that says "#chop-begin" and one that says "#chop-end". I do not want the client to have access to all of the functions without paying for the better version of the code.


You can use Python's re module.

Here is an example script that removes the marked sections of code from a file:

import re

# Non-greedy pattern; re.DOTALL lets "." match newlines, so a block can span lines
chop = re.compile(r'#chop-begin.*?#chop-end', re.DOTALL)

# Read the whole file
with open('data', 'r') as f:
    data = f.read()

# Remove everything from #chop-begin through #chop-end
data_chopped = chop.sub('', data)

# Write the result back to the same file
with open('data', 'w') as f:
    f.write(data_chopped)
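
Because the goal is to keep paid-only code away from the client, it may be worth checking that no marker survives the substitution: with this pattern, a "#chop-begin" that has no matching "#chop-end" is simply left in place. The following is only a sketch of such a check; the helper name chop_strict is hypothetical and not part of the script above.

```python
import re

CHOP = re.compile(r'#chop-begin.*?#chop-end', re.DOTALL)

def chop_strict(text):
    """Remove marked blocks; raise if any marker is left over (hypothetical helper)."""
    result = CHOP.sub('', text)
    # A leftover marker means a begin without an end (or the reverse),
    # so the block in question was NOT removed.
    if '#chop-begin' in result or '#chop-end' in result:
        raise ValueError('unbalanced #chop-begin/#chop-end markers')
    return result

print(chop_strict("a\n#chop-begin\nb\n#chop-end\nc\n"))
```

The check is deliberately conservative: any leftover text starting with "#chop-begin" (including a longer word that merely begins that way) raises an error, so the file is never written with a half-open block in it.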


With data.txt

do_something_public()

#chop-begin abcd
get_rid_of_me() #chop-end

#chop-beginner this should stay!

#chop-begin
do_something_private()
#chop-end   The rest of this comment should go too!

but_you_need_me()  #chop-begin  
last_to_go()
#chop-end

the following code

import re

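class Chopper(object):
    # The marker strings are split ('#ch'+'op-begin') so that this script's own
    # source never contains the literal markers.
    def __init__(self, start='\\s*#ch'+'op-begin\\b', end='#ch'+'op-end\\b.*?$'):
        super(Chopper,self).__init__()
        # DOTALL: "." also matches newlines; MULTILINE: "$" matches at the end of each line
        self.re = re.compile('{0}.*?{1}'.format(start,end), flags=re.DOTALL|re.MULTILINE)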

    def chop(self, s):
        return self.re.sub('', s)

    def chopFile(self, infname, outfname=None):
        if outfname is None:
            outfname = infname

        with open(infname) as inf:
            data = inf.read()

        with open(outfname, 'w') as outf:
            outf.write(self.chop(data))

ch = Chopper()
ch.chopFile('data.txt')

results in data.txt

do_something_public()

#chop-beginner this should stay!

but_you_need_me()
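
To see why the "#chop-beginner" line survives, it can help to compare a plain marker pattern with one that uses the \b word boundary and a trailing .*?$ on a small string. This is an illustrative sketch; the sample text and variable names are made up for the comparison.

```python
import re

text = (
    "#chop-beginner this should stay!\n"
    "keep()\n"
    "#chop-begin\n"
    "secret()\n"
    "#chop-end trailing\n"
)

# Plain markers: "#chop-begin" also matches the start of "#chop-beginner"
plain = re.compile(r'#chop-begin.*?#chop-end', re.DOTALL)

# \b requires the marker to end at a word boundary; .*?$ eats the rest of the end line
bounded = re.compile(r'\s*#chop-begin\b.*?#chop-end\b.*?$', re.DOTALL | re.MULTILINE)

plain_out = plain.sub('', text)
bounded_out = bounded.sub('', text)

print(repr(plain_out))
print(repr(bounded_out))
```

With the plain pattern the match starts inside "#chop-beginner" and swallows keep() as well; with the bounded pattern only the real block and the rest of its "#chop-end" line are removed.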


Use regular expressions:

import re

string = re.sub('#chop-begin.*?#chop-end', '', string, flags=re.DOTALL)

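.*? matches everything between the two markers, taking as little text as possible (non-greedy), and re.DOTALL lets it span multiple lines.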
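
The difference between the non-greedy .*? and the greedy .* only shows up when a file has more than one marked block. A minimal sketch, with a made-up sample string:

```python
import re

sample = (
    "keep_1()\n"
    "#chop-begin\n"
    "secret_1()\n"
    "#chop-end\n"
    "keep_2()\n"
    "#chop-begin\n"
    "secret_2()\n"
    "#chop-end\n"
    "keep_3()\n"
)

# Non-greedy: each block is matched and removed separately, so keep_2() survives
lazy = re.sub(r'#chop-begin.*?#chop-end', '', sample, flags=re.DOTALL)

# Greedy: a single match runs from the first begin marker to the last end marker
greedy = re.sub(r'#chop-begin.*#chop-end', '', sample, flags=re.DOTALL)

print(lazy)
print(greedy)
```

In the greedy case keep_2() is removed along with both secret blocks, which is why the ? matters here.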
