Passing meta-characters to Python as arguments from command line
I'm making a Python program that will parse the fields in some input lines. I'd like to let the user enter the field separator as an option from the command line. I'm using optparse
to do this. I'm running into the problem that entering something like \t
will separate literally on \t
, rather than on a tab, which is what I want. I'm pretty sure this is a Python thing and not the shell, since I've tried every combo of quotes, backslashes, and t
's that I can think of.
If I could get optparse
to let the argument be plain input (is there such a thing?) rather than raw_input
, I think that would work. But I have no clue how to do that.
I've also tried various substitutions and regex tricks to turn the string from the two character "\t"
into the one character tab, but without success.
Example, where input.txt
is:
field 1[tab]field\t2
(Note: [tab]
is a tab character and field\t2
is an 8 character string)
parseme.py:
#!/usr/bin/python
from optparse import OptionParser
parser = OptionParser()
parser.add_option("-d", "--delimiter", action="store", type="string",
dest="delimiter", default='\t')
parser.add_option("-f", dest="filename")
(options, args) = parser.parse_args()
Infile = open(options.filename, 'r')
Line = Infile.readline()
Fields = Line.split(options.delimiter)
print Fields[0]
print options.delimiter
Infile.close()
This gives me:
$ parseme.py -f input.txt
field 1
[tab]
Hey, great, the default setting worked properly. (Yes, I know I could just make \t the default and forget about it, but I'd like to know how to deal with this type of problem.)
$ parseme.py -f input.txt -d '\t'
field 1[tab]field
\t
This is not what I want.
>>> r'\t\n\v\r'.decode('string-escape')
'\t\n\x0b\r'
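The string-escape codec used above exists only in Python 2. Here is a minimal sketch of the same idea that should also run on Python 3, assuming the delimiter is plain ASCII. The helper name unescape is hypothetical, not part of any library:

```python
def unescape(text):
    # Encode to bytes first so the same call works on Python 2 and 3.
    # Assumption: text is ASCII; anything else raises UnicodeEncodeError.
    raw = text.encode("ascii")
    # unicode_escape interprets backslash escapes the way a Python
    # string literal would (\t, \n, \x0b, ...).
    return raw.decode("unicode_escape")

delimiter = unescape(r"\t")
print(repr(delimiter))
```

In the script from the question, this would be applied to options.delimiter right after parse_args().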
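The quick and dirty way is to eval
it, like this:
eval(options.delimiter, {}, {})
The extra empty dicts are there to prevent accidental clobbering of your program. Note that eval expects a Python expression, so the argument has to arrive as a quoted string literal (for example -d '"\t"' from a POSIX shell), and that eval will run whatever expression the user passes.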
Solving it from within your script (this needs import re):
options.delimiter = re.sub("\\\\t","\t",options.delimiter)
You can adapt the regex above to match more escaped characters (\n, \r, etc.).
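One way to extend the substitution to several escapes at once is a lookup table plus a replacement function. This is only a sketch: the names ESCAPES and expand_escapes are made up for illustration, and the set of supported escapes is an assumption.

```python
import re

# Hypothetical table of supported escapes; extend it as needed.
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}

def expand_escapes(text):
    # Match a backslash followed by one known escape character and
    # replace the pair with the character it stands for. A function is
    # used as the replacement so the result is inserted verbatim.
    return re.sub(r"\\([tnr\\])", lambda m: ESCAPES[m.group(1)], text)

delimiter = expand_escapes(r"\t")
print(repr(delimiter))
```

Anything that is not in the table is left alone, so an ordinary delimiter such as a comma passes through unchanged.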
Another way is to solve the problem outside Python.
When you call your script from the shell, do it like this:
parseme.py -f input.txt -d '^V<tab>'
Here ^V means "press Ctrl+V";
then press the normal Tab key.
This passes a literal tab character to your Python script.
The callback
option is a good way to handle tricky cases:
parser.add_option("-d", "--delimiter", action="callback", type="string",
callback=my_callback, default='\t')
with the corresponding function (to be defined before the parser, then):
def my_callback(option, opt, value, parser):
    val = value
    if value == '\\t':
        val = '\t'
    elif value == '\\n':
        val = '\n'
    parser.values.delimiter = val
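You can check that this works from the command line: python test.py -f test.txt -d '\t'
(in a POSIX shell the quotes are needed, because an unquoted \t
reaches the script as a plain t; in the Windows command prompt, pass \t without quotes).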
It has the advantage of handling the option via the 'optparse' module, not via post-processing the parsing results.
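For reference, here is a self-contained sketch of the callback approach that can be run without a shell. The names unescape_callback, build_parser and ESCAPES are hypothetical, and the argument list passed to parse_args stands in for the real command line:

```python
from optparse import OptionParser

# Hypothetical lookup table; only these two escapes are translated.
ESCAPES = {"\\t": "\t", "\\n": "\n"}

def unescape_callback(option, opt, value, parser):
    # Store the translated value under the option's dest ("delimiter");
    # values not in the table are stored unchanged.
    setattr(parser.values, option.dest, ESCAPES.get(value, value))

def build_parser():
    parser = OptionParser()
    parser.add_option("-d", "--delimiter", action="callback", type="string",
                      callback=unescape_callback, dest="delimiter",
                      default="\t")
    parser.add_option("-f", dest="filename")
    return parser

# Simulate: parseme.py -f input.txt -d '\t'
options, args = build_parser().parse_args(["-f", "input.txt", "-d", "\\t"])
print(repr(options.delimiter))
```

Using a dictionary instead of an if/elif chain keeps the callback short when more escapes are added.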