Searching a file in 3 different ways

I have been writing a program that searches a file in 3 different ways. Which search is run depends on the arguments given on the command line.

For example in the command line I type:

Program 1 search: python file.py 'search_term' 'file-to-be-searched'

Program 2 search: python file.py -z 'number' 'search_term' 'file-to-be-searched'

Program 3 search: python file.py -x 'search_term' 'file-to-be-searched'

All 3 search scripts are in the file.py.

The coding I have so far is:

import re
import sys
# Program 1
search_term = sys.argv[1]
f = sys.argv[2]

for line in open(f, 'r'):
    if re.search(search_term, line):
        print line,

# Program 2
flag = sys.argv[1]
num = sys.argv[2]
search_term = sys.argv[3]
f = sys.argv[4]

# Program 3
flag = sys.argv[1]
search_term = sys.argv[2]
f = sys.argv[3]

for line in open(f, 'r'):
    if re.match(search_term, line):
        print line,

Program 1 works fine, so that's no problem. Program 2 should find the search term in the file and print a number of lines before and after it, as set by the 'number' parameter, but I have no idea how to do this. Program 3 should find an exact match of the search term and print all the lines after it. re.match is inadequate because it only matches at the beginning of a string and does not consider the rest.

My final problem: how would I differentiate between the three programs, using the flags (or the absence of a flag) on the command line?

Any help would be appreciated.

Thanks


First of all you should look at two very useful Python modules:

  • fileinput: Iterate over lines from multiple input streams
  • optparse: A powerful command line option parser

fileinput will help you read lines from several files and even modify them in place if you need to. Your program will be much easier to extend and read with these tools.

Here is an example:

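import fileinput
import optparse

def process(line):
    # Placeholder: replace with your own matching logic.
    pass

if __name__ == '__main__':
    parser = optparse.OptionParser()
    # -z takes a value (the number of lines of context)
    parser.add_option("-z", dest="z", type="int", help="Description here")
    # -x is a bare flag, so it must not consume the search term
    parser.add_option("-x", dest="x", action="store_true", help="Description here")
    options, args = parser.parse_args()
    search_term = args[0]
    for line in fileinput.input(args[1:]):
        process(line)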
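
To tell the three programs apart, one option is to look at which flags were parsed. The following is only a sketch: the option names `context` and `exact`, the mode labels and the helper `choose_mode` are made up for illustration, and it assumes -z carries a number while -x is a bare flag, as in the command lines from the question.

```python
import optparse

def build_parser():
    parser = optparse.OptionParser(usage="%prog [-z NUM | -x] search_term file")
    # -z NUM: program 2, NUM lines of context around each match
    parser.add_option("-z", dest="context", type="int", default=None,
                      help="print NUM lines before and after each match")
    # -x: program 3, a flag without a value
    parser.add_option("-x", dest="exact", action="store_true", default=False,
                      help="print every line after the first exact match")
    return parser

def choose_mode(argv):
    """Return (mode, options, args) for a list of command-line arguments."""
    options, args = build_parser().parse_args(argv)
    if options.context is not None:
        mode = "context"   # program 2
    elif options.exact:
        mode = "after"     # program 3
    else:
        mode = "plain"     # program 1, no flag given
    return mode, options, args
```

In the script itself you would call choose_mode(sys.argv[1:]) and branch on the returned mode; args[0] is then the search term and args[1] the file name.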

For matching you can use re.search instead of re.match. An example from the docs:

>>> re.match("o", "dog")  # No match as "o" is not the first letter of "dog".
>>> re.search("o", "dog") # Match as search() looks everywhere in the string.
<_sre.SRE_Match object at ...>
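
For program 2 (printing a number of lines before and after each match), a rough sketch built on re.search could look like this. The function name `search_with_context` is hypothetical, and the sketch assumes the file is small enough to hold in memory and that overlapping context windows should be printed only once.

```python
import re

def search_with_context(lines, search_term, num):
    """Return every line within `num` lines of a line matching search_term."""
    lines = list(lines)
    keep = set()
    for i, line in enumerate(lines):
        if re.search(search_term, line):
            # Mark the match plus `num` lines on either side,
            # clamped to the start and end of the file.
            start = max(0, i - num)
            end = min(len(lines), i + num + 1)
            keep.update(range(start, end))
    # Emit the kept lines in their original order, without duplicates.
    return [lines[i] for i in sorted(keep)]
```

You would pass it the open file and int(num) from the command line, then write out each returned line.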

Edit: answering Jessica's comment

Say, for example, my file had the words zoo, zoos and zoological. If I typed zoo as my search term, all 3 would be returned rather than just zoo.

You could wrap the search term in \b so that it only matches the whole word. For example:

>>> re.search(r'\bzoo\b', 'test zoo')
<_sre.SRE_Match object at 0xb75706e8>
>>> re.search(r'\bzoo\b', 'test zoos')
>>> re.search(r'\bzoo\b', 'test zoological')

\b matches an empty string, but only at the beginning or end of a word.

So in your script you can do this:

search_term = r'\b%s\b' % search_term

Note: the r prefix is important here. Without it, '\b' is read as a backspace character, so you would have to escape each backslash.
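
Putting this together for program 3, here is a minimal sketch that prints everything after the first whole-word match. The function name `lines_after_match` is made up. The sketch assumes the search term is literal text rather than a regular expression (hence re.escape), and that the matching line itself is not part of the output.

```python
import re

def lines_after_match(lines, search_term):
    """Return all lines that follow the first whole-word match."""
    # re.escape treats the term as literal text; \b pins it to word boundaries.
    pattern = re.compile(r'\b%s\b' % re.escape(search_term))
    found = False
    result = []
    for line in lines:
        if found:
            result.append(line)
        elif pattern.search(line):
            found = True
    return result
```

Drop the re.escape call if the search term is meant to be a regular expression.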


Maybe it's a little too heavy for a short script, but in Python's standard library you'll find the getopt module and the more convenient optparse module.

getopt: this module helps scripts parse the command line arguments in sys.argv.

optparse is a more convenient, flexible, and powerful library for parsing command-line options than the old getopt module. optparse uses a more declarative style of command-line parsing: you create an instance of OptionParser, populate it with options, and parse the command line. optparse allows users to specify options in the conventional GNU/POSIX syntax, and additionally generates usage and help messages for you.
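
For comparison, a small getopt sketch for the same two flags might look like this. The function name `parse_with_getopt` and the keys of the `settings` dictionary are hypothetical; it assumes -z takes a number and -x takes nothing.

```python
import getopt

def parse_with_getopt(argv):
    """Parse -z NUM and -x; return (settings, remaining positional args)."""
    # In the option string, "z:" means -z takes a value; "x" is a bare flag.
    opts, args = getopt.getopt(argv, "z:x")
    settings = {"context": None, "exact": False}
    for flag, value in opts:
        if flag == "-z":
            settings["context"] = int(value)
        elif flag == "-x":
            settings["exact"] = True
    return settings, args
```

Called as parse_with_getopt(sys.argv[1:]), it leaves the search term and file name in the returned args list. Unlike optparse, you have to write the usage and help text yourself.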
