Why do I have to press Ctrl+D twice to close stdin?
I have the following Python script that reads numbers and outputs an error if the input is not a number.
import fileinput
import sys
for line in (txt.strip() for txt in fileinput.input()):
if not line.isdigit():
sys.stderr.write("ERROR: not a number: %s\n" % line)
If I get the input from stdin, I have to press Ctrl + D twice to end the program. Why?
I only have to press Ctrl + D once when I run 开发者_运维技巧the Python interpreter by itself.
bash $ python test.py
1
2
foo
4
5
<Ctrl+D>
ERROR: not a number: foo
<Ctrl+D>
bash $
In Python 3, this was due to a bug in Python's standard I/O library. The bug was fixed in Python 3.3.
In a Unix terminal, typing Ctrl+D doesn't actually close the process's stdin. But typing either Enter or Ctrl+D does cause the OS read
system call to return right away. So:
>>> sys.stdin.read(100)
xyzzy (I press Enter here)
(I press Ctrl+D once)
'xyzzy\n'
>>>
sys.stdin.read(100)
is delegated to sys.stdin.buffer.read
, which calls the system read() in a loop until either it accumulates the full requested amount of data; or the system read() returns 0 bytes; or an error occurs. (docs) (source)
Pressing Enter after the first line caused the system read() to return 6 bytes. sys.stdin.buffer.read
called read() again to try to get more input. Then I pressed Ctrl+D, causing read() to return 0 bytes. At this point, sys.stdin.buffer.read
gave up and returned just the 6 bytes it had collected earlier.
Note that the process still has my terminal on stdin, and I can still type stuff.
>>> sys.stdin.read() (note I can still type stuff to python)
xyzzy (I press Enter)
(Press Ctrl+D again)
'xyzzy\n'
OK. This is the part that was busted when this question was originally asked. It works now. But prior to Python 3.3, there was a bug.
The bug was a little complicated --- basically the problem was that two separate layers were doing the same work. BufferedReader.read()
was written to call self.raw.read()
repeatedly until it returned 0 bytes. However, the raw method, FileIO.read()
, performed a loop-until-zero-bytes of its own. So the first time you press Ctrl+D in a Python with this bug, it would cause FileIO.read()
to return 6 bytes to BufferedReader.read()
, which would then immediately call self.raw.read()
again. The second Ctrl+D would cause that to return 0 bytes, and then BufferedReader.read()
would finally exit.
This explanation is unfortunately much longer than my previous one, but it has the virtue of being correct. Bugs are like that...
Most likely this has to do with Python the following Python issues:
- 5505:
sys.stdin.read()
doesn't return after first EOF on Windows, and - 1633941:
for line in sys.stdin:
doesn't notice EOF the first time.
I wrote an explanation about this in my answer to this question.
How to capture Control+D signal?
In short, Control-D at the terminal simply causes the terminal to flush the input. This makes the read
system call return. The first time it returns with a non-zero value (if you typed something). The second time, it returns with 0, which is code for "end of file".
The first time it considers it to be input, the second time it's for keeps!
This only occurs when the input is from a tty. It is likely because of the terminal settings where characters are buffered until a newline (carriage return) is entered.
Using the "for line in file:" form of reading lines from a file, Python uses a hidden read-ahead buffer (see http://docs.python.org/2.7/library/stdtypes.html#file-objects at the file.next function). First of all, this explains why a program that writes output when each input line is read displays no output until you press CTRL-D. Secondly, in order to give the user some control over the buffering, pressing CTRL-D flushes the input buffer to the application code. Pressing CTRL-D when the input buffer is empty is treated as EOF.
Tying this together answers the original question. After entering some input, the first ctrl-D (on a line by itself) flushes the input to the application code. Now that the buffer is empty, the second ctrl-D acts as End-of-File (EOF).
file.readline()
does not exhibit this behavior.
精彩评论