
starting and stopping using a regular expression

In my program I apply a regular expression to each line until the word BREAK, then apply it again until the word STOP. The first part of the program takes the matches and converts them from military time to regular time. The second part divides the military time by a number the user inputs. My code works, but I write out the regular expression twice. How could I change my program so that I only use the regular expression once?

 with open(filename) as text:
        for line in text:
            pattern = re.search(r'((((2)([0-3]))|(([0-1])([0-9])))([0-5])([0-9]))', line)

            if pattern:
                pass  # convert the match to regular time (body omitted in the post)

            if re.match("BREAK", line):
                break

        for line in text:
            m= re.search(r'((((2)([0-3]))|(([0-1])([0-9])))([0-5])([0-9]))', line)
            if m:
                pass  # divide the military time by the user's number (body omitted in the post)

            if re.match("STOP", line):
                break


Firstly, your regex r'((((2)([0-3]))|(([0-1])([0-9])))([0-5])([0-9]))' has a preposterous number of parentheses in it.

Presumably you are not using the capturing groups so created. You appear to want to match HHMM where HH is 00 to 23 and MM is 00 to 59.

r'(2[0-3]|[01][0-9])[0-5][0-9]' will do the same job. You can avoid the one remaining capturing group by writing r'(?:2[0-3]|[01][0-9])[0-5][0-9]'.

You may want to avoid spurious matches (e.g. the "2345" in "blah 23456789") by putting \b at each end of the pattern.
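To see the effect of these changes, here is a small sketch that runs the original pattern, the simplified one, and the \b-bounded one against a few sample strings. The sample strings and the helper name first_match are made up for illustration; they are not from the question.

```python
import re

# The pattern from the question, the simplified form, and the \b-bounded form.
ORIGINAL = re.compile(r'((((2)([0-3]))|(([0-1])([0-9])))([0-5])([0-9]))')
SIMPLE = re.compile(r'(?:2[0-3]|[01][0-9])[0-5][0-9]')
BOUNDED = re.compile(r'\b(?:2[0-3]|[01][0-9])[0-5][0-9]\b')

def first_match(regex, text):
    """Return the first matched substring, or None if nothing matches.

    Hypothetical helper, used only to compare the three patterns.
    """
    m = regex.search(text)
    return m.group(0) if m else None

# Made-up sample lines.
samples = ["depart 0930", "arrive 2359 local", "blah 23456789", "2460 is invalid"]
for s in samples:
    print(repr(s), first_match(ORIGINAL, s), first_match(SIMPLE, s), first_match(BOUNDED, s))
```

The original and simplified patterns should agree on every sample; only the bounded pattern rejects the "2345" buried inside "23456789".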

Here's a replacement for your code:

import re
searcher = re.compile(r'\b(?:2[0-3]|[01][0-9])[0-5][0-9]\b').search
with open(filename) as text:
        for line in text:
            m = searcher(line)
            if m:
                do_something_1(line, m)
            if line.startswith("BREAK"): # equivalent to your code; is that what you really mean??
                break
        for line in text:
            m = searcher(line)
            if m:
                do_something_2(line, m)
            if line.startswith("STOP"): # equivalent to your code; is that what you really mean??
                break   
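For something you can run without a file, here is a rough sketch of the same two-loop structure with the do_something placeholders filled in. Everything concrete in it is an assumption: to_12_hour and process are hypothetical names, "regular time" is taken to mean a 12-hour clock, "divide the military time" is taken to mean dividing the HHMM value as an integer, and io.StringIO stands in for the opened file.

```python
import io
import re

TIME_RE = re.compile(r'\b(?:2[0-3]|[01][0-9])[0-5][0-9]\b')

def to_12_hour(hhmm):
    """Convert 'HHMM' (24-hour) to a 12-hour string such as '1:05 PM' (assumed format)."""
    hour, minute = int(hhmm[:2]), int(hhmm[2:])
    suffix = "AM" if hour < 12 else "PM"
    return f"{hour % 12 or 12}:{minute:02d} {suffix}"

def process(lines, divisor):
    """Convert times up to BREAK, then divide times up to STOP."""
    converted, divided = [], []
    lines = iter(lines)  # one shared iterator, so the second loop resumes after BREAK
    for line in lines:
        m = TIME_RE.search(line)
        if m:
            converted.append(to_12_hour(m.group(0)))
        if line.startswith("BREAK"):
            break
    for line in lines:
        m = TIME_RE.search(line)
        if m:
            divided.append(int(m.group(0)) / divisor)  # assumed meaning of "divide"
        if line.startswith("STOP"):
            break
    return converted, divided

# Made-up input standing in for the file contents.
sample = io.StringIO("1305 start\n0000 midnight\nBREAK\n1200 noon\nSTOP\n2359 ignored\n")
print(process(sample, 2))
```

The regex is written once and compiled once; the two loops differ only in what they do with each match.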


The simplest approach is to compile the regex once and reuse it:

my_re = re.compile("your regex")
my_re.search(some_string)
my_re.search(some_other_string)

That avoids defining the regex twice.

Depending on the contents of the document, you could also split on 'BREAK' or match multiple times; it's hard to know without seeing an example or a fuller description.
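As a rough idea of what the splitting approach might look like, assuming the whole file fits in memory and that BREAK and STOP each appear as plain marker text: split_times is a hypothetical name, and note that str.partition cuts at the first occurrence anywhere in the text, not only at the start of a line, so this is not strictly equivalent to the line-by-line loops.

```python
import re

TIME_RE = re.compile(r'\b(?:2[0-3]|[01][0-9])[0-5][0-9]\b')

def split_times(text):
    """Return (times before BREAK, times between BREAK and STOP)."""
    before_break, _, rest = text.partition("BREAK")
    before_stop, _, _ = rest.partition("STOP")
    # findall returns whole matches here because the pattern has no capturing groups.
    return TIME_RE.findall(before_break), TIME_RE.findall(before_stop)

# Made-up input for illustration.
sample = "0930 a\n1415 b\nBREAK\n1200 c\nSTOP\n2359 d\n"
print(split_times(sample))
```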
