Python select() behavior is strange
I'm having some trouble understanding the behavior of select.select. Please consider the following Python program:
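#!/usr/bin/env python
import select
import sys

def str_to_hex(s):
    # Convert each character of s to two uppercase hex digits.
    def dig(n):
        if n > 9:
            return chr(65 - 10 + n)
        else:
            return chr(48 + n)
    r = ''
    while len(s) > 0:
        c = s[0]
        s = s[1:]
        a = ord(c) // 16
        b = ord(c) % 16
        r = r + dig(a) + dig(b)
    return r

while True:
    # Block until stdin is reported as readable.
    ans, _, _ = select.select([sys.stdin], [], [])
    print(ans)
    s = ans[0].read(1)
    if len(s) == 0:
        break
    print(str_to_hex(s))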
I have saved this to a file "test.py". If I invoke it as follows:
echo 'hello' | ./test.py
then I get the expected behavior: select never blocks and all of the data is printed; the program then terminates.
But if I run the program interactively, I get a most undesirable behavior. Please consider the following console session:
$ ./test.py
hello
[<open file '<stdin>', mode 'r' at 0xb742f020>]
68
The program then hangs there; select.select is now blocking again. It is not until I provide more input or close the input stream that the next character (and all of the rest of them) are printed, even though there are already characters waiting! Can anyone explain this behavior to me? I am seeing something similar in a stream tunneling program I have written and it's wrecking the entire affair.
Thanks for reading!
The read method of sys.stdin works at a higher level of abstraction than select. When you do ans[0].read(1), Python actually reads a larger number of bytes from the operating system and buffers them internally. select is not aware of this extra buffering; it only sees that everything has been read, and so it blocks until either an EOF or more input arrives. You can observe this behaviour by running something like strace -e read,select python yourprogram.py.
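The mismatch can be reproduced without a terminal. The following is a minimal sketch, assuming Python 3 on a Unix-like system, with a pipe standing in for stdin. The function name demo_buffered_read is made up for illustration, and the amount read ahead is an implementation detail of CPython's buffered reader.

```python
import os
import select

def demo_buffered_read():
    """Show that select() cannot see bytes held in Python's own buffer."""
    r, w = os.pipe()
    os.write(w, b"hello")          # 5 bytes are now waiting in the kernel pipe
    f = os.fdopen(r, "rb")         # buffered file object, similar to sys.stdin

    ready_before, _, _ = select.select([f], [], [], 0)
    first = f.read(1)              # the buffered reader fetches everything available
    ready_after, _, _ = select.select([f], [], [], 0)

    rest = f.read(4)               # served from Python's buffer, not from the pipe
    os.close(w)
    f.close()
    return bool(ready_before), first, bool(ready_after), rest

print(demo_buffered_read())
```

After the single read(1), select reports nothing to read even though four more characters can still be obtained from the file object.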
One solution is to replace ans[0].read(1) with os.read(ans[0].fileno(), 1). os.read is a lower level interface without any buffering between it and the operating system, so it's a better match for select.
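As a rough sketch of that approach (again assuming Python 3 on a Unix-like system, with a pipe in place of stdin and a hypothetical helper named drain_unbuffered), every byte stays visible to select until os.read actually consumes it:

```python
import os
import select

def drain_unbuffered(fd):
    """Read fd one byte at a time, asking select() before every read."""
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], 0)   # timeout 0: poll only
        if not ready:
            break                   # nothing waiting in the kernel right now
        data = os.read(fd, 1)       # one byte straight from the OS, no buffering
        if not data:
            break                   # an empty result means EOF
        chunks.append(data)
    return b"".join(chunks)

r, w = os.pipe()
os.write(w, b"hello")
print(drain_unbuffered(r))
os.close(r)
os.close(w)
```

The zero timeout is only there so the demonstration terminates; a real program would normally let select block as in the original loop.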
Alternatively, running python with the -u command-line option also seems to disable the extra buffering.
It's waiting for you to signal EOF (you can do this with Ctrl+D when used interactively). You can use sys.stdin.isatty() to check whether the script is being run interactively and handle it accordingly, using, say, raw_input instead. I also doubt you need to use select.select at all. Why not just use sys.stdin.read?
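import sys

if sys.stdin.isatty():
    try:
        while True:
            for s in raw_input():
                print(str_to_hex(s))
    except EOFError:  # Ctrl+D ends interactive input
        pass
else:
    # iter() stops when read(1) returns '' at EOF
    for s in iter(lambda: sys.stdin.read(1), ''):
        print(str_to_hex(s))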
This would make it appropriate for both interactive use and stream processing.