Searching a file in 3 different ways
I have been writing a program that searches a file in three different ways. Which search to run is selected on the command line.
For example, on the command line I type:
Program 1 search: python file.py 'search_term' 'file-to-be-searched'
Program 2 search: python file.py -z 'number' 'search_term' 'file-to-be-searched'
Program 3 search: python file.py -x 'search_term' 'file-to-be-searched'
All three searches live in file.py.
The code I have so far is:
import re
import sys
#program 1
search_term = sys.argv[1]
f = sys.argv[2]
for line in open(f, 'r'):
    if re.search(search_term, line):
        print line,
# Program 2
flag = sys.argv[1]
num = sys.argv[2]
search_term = sys.argv[3]
f = sys.argv[4]
#program 3
flag = sys.argv[1]
search_term = sys.argv[2]
f = sys.argv[3]
for line in open(f, 'r'):
    if re.match(search_term, line):
        print line,
Program 1 works fine, that's no problem. Program 2 should find the search term in the file and print a number of lines before and after it, as set by the 'number' parameter, but I have no idea how to do this. Program 3 should find the exact match for the search term and print all the lines after it. re.match is inadequate because it only matches at the beginning of a string and does not consider the rest.
My final problem: how would I differentiate between the three programs, using the flags (or the absence of a flag) from the command line?
Any help would be appreciated.
Thanks
First of all you should look at two very useful Python modules:
- fileinput: Iterate over lines from multiple input streams
- optparse: A powerful command line option parser
fileinput will help you read lines from several files and even modify them if you need to. Your program will be much easier to extend and read with these tools.
Here is an example:
import fileinput
import optparse
if __name__ == '__main__':
    parser = optparse.OptionParser()
    parser.add_option("-z", dest="z", help="Description here")
    parser.add_option("-x", dest="x", help="Description here")
    options, args = parser.parse_args()
    search_term = args[0]
    for line in fileinput.input(args[1:]):
        # process() is a placeholder for your own per-line search logic
        process(line)
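To make the dispatch between the three searches concrete, here is a minimal sketch in Python 3 syntax. The function names (grep_plain, grep_context, grep_after, build_parser, run) are hypothetical and not from the question. It also rests on two assumptions: that -x is a plain on/off flag (unlike the example above, where -x takes a value), and that "all the lines after the search term" means every line following the first matching line, not including that line itself.

```python
import optparse
import re
import sys
from collections import deque


def grep_plain(pattern, lines):
    """No flag: return every line in which the pattern occurs."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


def grep_context(pattern, lines, num):
    """-z NUM: return matching lines plus up to num lines before and after."""
    regex = re.compile(pattern)
    before = deque(maxlen=num)  # remembers only the last num unprinted lines
    remaining_after = 0         # trailing context lines still owed
    out = []
    for line in lines:
        if regex.search(line):
            out.extend(before)
            before.clear()
            out.append(line)
            remaining_after = num
        elif remaining_after > 0:
            out.append(line)
            remaining_after -= 1
        else:
            before.append(line)
    return out


def grep_after(pattern, lines):
    """-x: return every line after the first matching line (assumed meaning)."""
    regex = re.compile(pattern)
    found = False
    out = []
    for line in lines:
        if found:
            out.append(line)
        elif regex.search(line):
            found = True
    return out


def build_parser():
    parser = optparse.OptionParser(usage="%prog [-z NUM | -x] search_term file")
    parser.add_option("-z", dest="context", type="int", default=None,
                      help="print NUM lines before and after each match")
    parser.add_option("-x", dest="after", action="store_true", default=False,
                      help="print all lines after the first match")
    return parser


def run(argv):
    """Parse argv (without the script name) and return the lines to print."""
    parser = build_parser()
    options, args = parser.parse_args(argv)
    if len(args) != 2:
        parser.error("expected a search term and a file name")
    search_term, filename = args
    with open(filename) as handle:
        lines = handle.readlines()
    if options.context is not None:
        return grep_context(search_term, lines, options.context)
    if options.after:
        return grep_after(search_term, lines)
    return grep_plain(search_term, lines)


if __name__ == "__main__":
    sys.stdout.writelines(run(sys.argv[1:]))
```

The option parser decides which function runs, so no mode has to inspect sys.argv by position. Keeping each search in its own function also makes it possible to test them on a list of strings without touching a file.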
For matching you can use re.search instead of re.match. An example from the docs:
>>> re.match("o", "dog") # No match as "o" is not the first letter of "dog".
>>> re.search("o", "dog") # Match as search() looks everywhere in the string.
<_sre.SRE_Match object at ...>
Edit: answering Jessica's comment
Say, for example, in my file I had the words: zoo, zoos and zoological. If I typed zoo as my search term, all 3 would be returned rather than just zoo.
You could wrap the search term in \b to match only the whole word, for example:
>>> re.search(r'\bzoo\b', 'test zoo')
<_sre.SRE_Match object at 0xb75706e8>
>>> re.search(r'\bzoo\b', 'test zoos')
>>> re.search(r'\bzoo\b', 'test zoological')
\b matches an empty string, but only at the beginning or end of a word.
So in your script you can do this:
searchterm = r'\b%s\b' % searchterm
Note: the r prefix matters here; without it you would have to escape the backslash ('\\b').
Maybe it's a little too heavy for a short script, but in Python's standard library you'll find the getopt module and the more convenient optparse module.
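One caveat with the substitution above: if the search term contains regex metacharacters such as '.' or '*', they are interpreted as part of the pattern. A small sketch of one way to handle that, assuming the term is meant literally, is to pass it through re.escape first. The helper name whole_word_pattern is hypothetical, and the code uses Python 3 print syntax.

```python
import re


def whole_word_pattern(term):
    """Compile a pattern matching term only as a whole word.

    re.escape makes metacharacters in term literal (an assumption:
    the user typed a plain word, not a regular expression).
    """
    return re.compile(r'\b%s\b' % re.escape(term))


pattern = whole_word_pattern("zoo")
words = ["zoo", "zoos", "zoological"]
matches = [word for word in words if pattern.search(word)]
print(matches)  # ['zoo']
```

If the search term is supposed to be a regular expression in its own right, skip re.escape and keep the plain substitution.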
getopt: this module helps scripts parse the command-line arguments in sys.argv.
optparse is a more convenient, flexible, and powerful library for parsing command-line options than the old getopt module. optparse uses a more declarative style of command-line parsing: you create an instance of OptionParser, populate it with options, and parse the command line. optparse allows users to specify options in the conventional GNU/POSIX syntax, and additionally generates usage and help messages for you.