On-the-fly output redirection: seeing the redirected output in the file while the program is still running

If I use a command like this one:

./program >> a.txt &

and the program is a long-running one, then I can only see the output once the program has ended. That means I have no way of knowing whether the computation is going well until it actually stops computing. I want to be able to read the redirected output in the file while the program is running.

This is similar to opening a file, appending to it, and closing it again after every write. If the file is only closed at the end of the program, no data can be read from it until the program ends. The only redirection I know behaves like closing the file at the end of the program.

You can test it with this little python script. The language doesn't matter. Any program that writes to standard output has the same problem.

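l = range(0, 100000)
for i in l:
  if i % 1000 == 0:
    print(i)  # progress marker; print(i) works in both Python 2 and 3
  for j in l:
    s = i + j  # busy work that stands in for a long computation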

One can run this with:

python program.py >> a.txt &

Then run cat a.txt: you will only get results once the script is done computing.


From the stdout manual page:

The stream stderr is unbuffered. The stream stdout is line-buffered when it points to a terminal. Partial lines will not appear until fflush(3) or exit(3) is called, or a new‐line is printed.

Bottom line: Unless the output is a terminal, your program will have its standard output in fully buffered mode by default. This essentially means that it will output data in large-ish blocks, rather than line-by-line, let alone character-by-character.
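A quick way to see which case applies is to ask the stream itself. The following is a minimal sketch, assuming Python 3; describe_stream is a hypothetical helper name introduced here for illustration, not something from the manual page.

```python
import sys


def describe_stream(stream):
    """Report whether a text stream is attached to a terminal and line-buffered."""
    return {
        # True only when the stream is connected to a terminal (TTY).
        "isatty": stream.isatty(),
        # Python 3 text streams expose their line-buffering flag; None if absent.
        "line_buffering": getattr(stream, "line_buffering", None),
    }


if __name__ == "__main__":
    # Report on stderr so the answer is visible even when stdout is redirected.
    print(describe_stream(sys.stdout), file=sys.stderr)
```

Running the script once directly and once with stdout redirected to a file should show the two cases side by side.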

Ways to work around this:

  • Fix your program: If you need real-time output, you need to fix your program. In C you can call fflush(stdout) after each output statement, or use setvbuf() to change the buffering mode of the standard output. For Python there is sys.stdout.flush(), or even some of the suggestions here.

  • Use a utility that can record from a PTY, rather than outright stdout redirections. GNU Screen can do this for you:

    screen -d -m -L python test.py
    

    would be a start. This will log the output of your program to a file called screenlog.0 (or similar) in your current directory with a default delay of 10 seconds, and you can use screen to connect to the session where your command is running to provide input or terminate it. The delay and the name of the logfile can be changed in a configuration file or manually once you connect to the background session.
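For the first workaround (fixing the program), a Python version might look like the sketch below. It assumes Python 3; report_progress and its delay parameter are hypothetical names used only for illustration.

```python
import sys
import time


def report_progress(steps, out=sys.stdout, delay=0.0):
    """Write one progress line per step and flush it immediately."""
    for i in range(steps):
        out.write(f"{i}\n")
        # Push the line out of the buffer now instead of waiting for it to fill.
        out.flush()
        time.sleep(delay)  # stands in for the real computation
```

Calling report_progress(5, delay=1.0) from a script started with python program.py >> a.txt & should make each line appear in a.txt as soon as it is written. In Python 3, print(i, flush=True) is an equivalent shorthand for the write-then-flush pair.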

EDIT:

On most Linux systems there is a third workaround: you can use the LD_PRELOAD variable and a preloaded library to override select functions of the C library and use them to set the stdout buffering mode when those functions are called by your program. This method may work, but it has a number of disadvantages:

  • It won't work at all on static executables

  • It's fragile and rather ugly.

  • It won't work at all with SUID executables - the dynamic loader will refuse to read the LD_PRELOAD variable when loading such executables for security reasons.

  • It's fragile and rather ugly.

  • It requires that you find and override a library function that is called by your program after it initially sets the stdout buffering mode and preferably before any output. getenv() is a good choice for many programs, but not all. You may have to override common I/O functions such as printf() or fwrite() - if push comes to shove you may just have to override all functions that control the buffering mode and introduce a special condition for stdout.

  • It's fragile and rather ugly.

  • It's hard to ensure that there are no unwelcome side-effects. To do this right you'd have to ensure that only stdout is affected and that your overrides will not crash the rest of the program if e.g. stdout is closed.

  • Did I mention that it's fragile and rather ugly?

That said, the process is relatively simple. You put the replacement functions in a C file, e.g. linebufferedstdout.c:

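#define _GNU_SOURCE   /* needed for RTLD_NEXT */
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>


/* Replacement getenv(): switches stdout to line-buffered mode on the
   first call, then forwards every call to the real getenv(). */
char *getenv(const char *s) {
    static char *(*getenv_real)(const char *s) = NULL;

    if (getenv_real == NULL) {
        /* Look up the next getenv() in library search order (the libc one). */
        getenv_real = dlsym(RTLD_NEXT, "getenv");

        setlinebuf(stdout);
    }

    return getenv_real(s);
}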

Then you compile that file as a shared object:

gcc -O2 -o linebufferedstdout.so -fpic -shared linebufferedstdout.c -ldl -lc

Then you set the LD_PRELOAD variable to load it along with your program:

$ LD_PRELOAD=./linebufferedstdout.so python test.py | tee -a test.out 
0
1000
2000
3000
4000

If you are lucky, your problem will be solved with no unfortunate side-effects.

You can set the LD_PRELOAD library in the shell, if necessary, or even specify that library system-wide (definitely NOT recommended) in /etc/ld.so.preload.


If you're trying to modify the behavior of an existing program, try stdbuf (apparently part of coreutils starting with version 7.5).

This buffers stdout up to a line:

stdbuf -oL command > output

This disables stdout buffering altogether:

stdbuf -o0 command > output


Have you considered piping to tee?

./program | tee a.txt

However, even tee won't work if "program" doesn't write anything to stdout until it is done. So, the effectiveness depends a lot on how your program behaves.
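To make the mechanism concrete, here is a rough Python 3 sketch of what a tee-like copy loop does. The function name tee and its signature are illustrative assumptions, not the real utility's interface. Note that it can only forward lines once the upstream program actually emits them, which is exactly the limitation described above.

```python
def tee(lines, *sinks):
    """Copy each incoming line to every sink, flushing after each line."""
    count = 0
    for line in lines:
        for sink in sinks:
            sink.write(line)
            sink.flush()  # make the line visible to readers right away
        count += 1
    return count
```

A possible use is tee(sys.stdin, sys.stdout, log) with log opened via open("a.txt", "a"), run as ./program | python tee.py, where tee.py is a hypothetical file holding this code.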


If the program writes to a file, you can read it while it is being written using tail -f a.txt.
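If tail is not available, the same idea can be approximated by polling the file. This is a minimal sketch assuming Python 3; read_new_lines is a hypothetical helper, and it only returns what the writer has already flushed to disk.

```python
def read_new_lines(path, offset=0):
    """Return complete lines appended to path since byte offset, plus the new offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Keep only whole lines; a trailing partial line is left for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode(errors="replace").splitlines()
    return lines, offset + end
```

Calling it in a loop with a short time.sleep() between polls, and feeding the returned offset back in, gives behaviour roughly comparable to tail -f a.txt.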


Your problem is that most programs check whether the output is a terminal or not. If the output is a terminal, it is buffered one line at a time (so each line is output as it is generated), but if the output is not a terminal, it is buffered in larger chunks (4096 bytes at a time is typical). This is normal behaviour in the C library (when using printf, for example) and also in the C++ library (when using cout, for example), so any program written in C or C++ will do this.
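The effect of such a block buffer can be imitated in a few lines. The sketch below assumes Python 3 and uses io.BufferedWriter as a stand-in for the C library's buffer, so it is an analogy rather than the same code path; bytes_visible_after_write is a hypothetical name.

```python
import io


def bytes_visible_after_write(payload, buffer_size=4096):
    """Write payload through a block buffer; return what reached the target
    before and after an explicit flush."""
    target = io.BytesIO()  # stands in for the redirected file
    writer = io.BufferedWriter(target, buffer_size=buffer_size)
    writer.write(payload)
    before = target.getvalue()  # what a reader of the file would see now
    writer.flush()
    after = target.getvalue()   # what a reader sees once the buffer is flushed
    return before, after
```

With a payload much smaller than the buffer, the first value comes back empty: the data sits in the buffer until it is flushed, which is why the redirected file stays empty for so long.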

The interpreters for most scripting languages (Perl, Python, etc.) are written in C or C++, so they show the same buffering behaviour.

The answer above (using LD_PRELOAD) can be made to work on perl or python scripts, since the interpreters are themselves written in C.


The unbuffer command from the expect package does exactly what you are looking for.

$ sudo apt-get install expect
$ unbuffer python program.py | cat -
<watch output immediately show up here>
0
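Tools of this kind work by giving the program a pseudo-terminal, so that it believes it is writing to a terminal. A rough Python 3 sketch of that idea, using the standard pty module, follows. It assumes a Linux system; run_under_pty is a hypothetical helper, and it collects the output rather than streaming it, so treat it as an illustration and not a replacement for unbuffer.

```python
import os
import pty
import subprocess


def run_under_pty(argv):
    """Run argv with stdout/stderr attached to a pseudo-terminal; return its output."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(argv, stdout=slave_fd, stderr=subprocess.STDOUT)
    finally:
        # The parent keeps only the master end; the child holds the slave end.
        os.close(slave_fd)
    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 1024)
            except OSError:
                break  # on Linux, reading fails once the child has closed the PTY
            if not data:
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
    proc.wait()
    # Terminal output typically uses "\r\n" line endings.
    return b"".join(chunks).decode(errors="replace")
```

A child started this way should report that its standard output is a terminal, which is what makes the C library choose line buffering.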
