on-the-fly output redirection, seeing the file redirection output while the program is still running

2023-02-20 20:22 Q&A author:

If I use a command like this one:

./program >> a.txt &

and the program is a long-running one, then I can only see the output once the program has ended. That means I have no way of knowing whether the computation is going well until it actually stops computing. I want to be able to read the redirected output in the file while the program is running.

This is similar to opening a file, appending to it, then closing it again after every write. If the file is only closed at the end of the program, then no data can be read from it until the program ends. The only redirection I know of behaves like closing the file at the end of the program.
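As a rough illustration of the open, append, close pattern described above, here is a minimal Python sketch. The helper names append_line and read_all and the scratch file are hypothetical, made up for this example; the sketch only shows that data written this way can be read back right after each call.

```python
import os
import tempfile


def append_line(path, text):
    """Open the file, append one line, and close it again."""
    with open(path, "a") as f:
        f.write(text + "\n")
    # Leaving the with-block closes the file, which flushes the buffered data.


def read_all(path):
    """Return the current contents of the file."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    # Hypothetical scratch file, used only for this demonstration.
    path = os.path.join(tempfile.mkdtemp(), "a.txt")
    append_line(path, "0")
    print(repr(read_all(path)))
    append_line(path, "1000")
    print(repr(read_all(path)))
```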

You can test it with this little Python script. The language doesn't matter; any program that writes to standard output has the same problem.

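# Python 2 script from the question; print(i) also runs unchanged on Python 3.
l = range(0, 100000)
for i in l:
    if i % 1000 == 0:
        print(i)  # progress line every 1000 outer iterations
    for j in l:
        s = i + j  # busy work to keep the program running for a long time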

One can run this with:

python program.py >> a.txt &

Then run cat a.txt: you will only get results once the script is done computing.

From the stdout manual page:

The stream stderr is unbuffered. The stream stdout is line-buffered when it points to a terminal. Partial lines will not appear until fflush(3) or exit(3) is called, or a newline is printed.

Bottom line: Unless the output is a terminal, your program will have its standard output in fully buffered mode by default. This essentially means that it will output data in large-ish blocks, rather than line-by-line, let alone character-by-character.
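The terminal check can be observed directly. The following is a minimal sketch, assuming Python 3.7 or later, whose own I/O layer applies the same rule as the C library: it starts a child interpreter with its standard output redirected to a pipe and lets the child report whether it sees a terminal and whether it is line-buffered. The function name child_stdout_mode is made up for this example.

```python
import subprocess
import sys

# Code run by the child: report how it sees its own standard output.
CHILD = "import sys; print(sys.stdout.isatty(), sys.stdout.line_buffering)"


def child_stdout_mode():
    """Run a child interpreter with stdout redirected to a pipe.

    Returns the two words the child printed: whether its stdout is a
    terminal, and whether its stdout is line-buffered.
    """
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,  # stdout becomes a pipe, not a terminal
        text=True,
        check=True,
    )
    return result.stdout.split()


if __name__ == "__main__":
    is_terminal, line_buffered = child_stdout_mode()
    print("child stdout is a terminal:", is_terminal)
    print("child stdout is line-buffered:", line_buffered)
```

Running the same one-liner directly in an interactive terminal, without redirection, should report a terminal with line buffering instead.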

Ways to work around this:

Fix your program: If you need real-time output, you need to fix your program. In C you can use fflush(stdout) after each output statement, or setvbuf() to change the buffering mode of the standard output. For Python there is sys.stdout.flush(), or one of the other ways of disabling output buffering.
Use a utility that can record from a PTY, rather than outright stdout redirection. GNU Screen can do this for you:
```
screen -d -m -L python test.py
```
would be a start. This will log the output of your program to a file called screenlog.0 (or similar) in your current directory with a default delay of 10 seconds, and you can use screen to connect to the session where your command is running to provide input or terminate it. The delay and the name of the logfile can be changed in a configuration file or manually once you connect to the background session.
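The first workaround above (flushing inside the program) could look roughly like this in Python. This is only a sketch: the helper name report and the loop bounds are made up for the example, and the heavy inner loop from the question's script is left out.

```python
import sys


def report(value, stream=sys.stdout):
    """Write one progress line and flush it immediately.

    The explicit flush() pushes the line out of the buffer even when the
    stream is redirected to a file and would otherwise be block-buffered.
    """
    stream.write("%d\n" % value)
    stream.flush()


def main(limit=5000, step=1000):
    for i in range(limit):
        if i % step == 0:
            report(i)
        # ... the long-running computation would go here ...


if __name__ == "__main__":
    main()
```

With the flush in place, each progress line should show up in a.txt as soon as it is written when the script is run with its output redirected. On Python 3, print(i, flush=True) has the same effect.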

EDIT:

On most Linux systems there is a third workaround: you can use the LD_PRELOAD variable and a preloaded library to override select functions of the C library and use them to set the stdout buffering mode when those functions are called by your program. This method may work, but it has a number of disadvantages:

It won't work at all on static executables
It's fragile and rather ugly.
It won't work at all with SUID executables - the dynamic loader will refuse to read the LD_PRELOAD variable when loading such executables for security reasons.
It's fragile and rather ugly.
It requires that you find and override a library function that is called by your program after it initially sets the stdout buffering mode and preferably before any output. getenv() is a good choice for many programs, but not all. You may have to override common I/O functions such as printf() or fwrite() - if push comes to shove you may just have to override all functions that control the buffering mode and introduce a special condition for stdout.
It's fragile and rather ugly.
It's hard to ensure that there are no unwelcome side-effects. To do this right you'd have to ensure that only stdout is affected and that your overrides will not crash the rest of the program if e.g. stdout is closed.
Did I mention that it's fragile and rather ugly?

That said, the process is relatively simple. You put the replacement functions in a C file, e.g. linebufferedstdout.c:

#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>


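/* Replacement getenv(): switches stdout to line-buffered mode on its first call. */
char *getenv(const char *s) {
    static char *(*getenv_real)(const char *s) = NULL;

    if (getenv_real == NULL) {
        /* Look up the real getenv() in the next library in the search order. */
        getenv_real = dlsym(RTLD_NEXT, "getenv");

        /* First call only: make stdout line-buffered. */
        setlinebuf(stdout);
    }

    /* Hand the request over to the real implementation. */
    return getenv_real(s);
}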

Then you compile that file as a shared object:

gcc -O2 -o linebufferedstdout.so -fpic -shared linebufferedstdout.c -ldl -lc

Then you set the LD_PRELOAD variable to load it along with your program:

$ LD_PRELOAD=./linebufferedstdout.so python test.py | tee -a test.out 
0
1000
2000
3000
4000

If you are lucky, your problem will be solved with no unfortunate side-effects.

You can set the LD_PRELOAD library in the shell, if necessary, or even specify that library system-wide (definitely NOT recommended) in /etc/ld.so.preload.

If you're trying to modify the behavior of an existing program, try stdbuf (part of coreutils since version 7.5, apparently).

This buffers stdout up to a line:

stdbuf -oL command > output

This disables stdout buffering altogether:

stdbuf -o0 command > output

Have you considered piping to tee?

./program | tee a.txt

However, even tee won't work if "program" doesn't write anything to stdout until it is done. So, the effectiveness depends a lot on how your program behaves.

If the program writes to a file, you can read it while it is being written using tail -f a.txt.
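For readers who want the same effect from inside a script, here is a rough Python sketch of what following a growing file involves. It is not how tail is implemented: it simply polls the file, and unlike tail -f it starts at the beginning of the file. The function name follow and the parameters poll_interval and max_idle_polls are made up for this example.

```python
import time


def follow(path, poll_interval=0.5, max_idle_polls=None):
    """Yield complete lines from path as they are appended to it.

    max_idle_polls is a hypothetical knob for this sketch: stop after that
    many consecutive polls without new data (None means follow forever).
    """
    idle = 0
    pending = ""  # holds a partial line until its newline arrives
    with open(path, "r") as f:
        while max_idle_polls is None or idle < max_idle_polls:
            chunk = f.readline()
            if not chunk:
                # Nothing new yet: wait a little and try again.
                idle += 1
                time.sleep(poll_interval)
                continue
            idle = 0
            pending += chunk
            if pending.endswith("\n"):
                yield pending.rstrip("\n")
                pending = ""


if __name__ == "__main__":
    # Follows a.txt (the file name used in this thread) until interrupted.
    for line in follow("a.txt"):
        print(line)
```

Note that this only helps if the writing program actually flushes its output; a fully buffered writer will still deliver its lines in large blocks.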

Your problem is that most programs check whether the output is a terminal. If it is, output is buffered one line at a time (so each line is output as it is generated), but if it is not, output is buffered in larger chunks (4096 bytes at a time is typical). This is the normal behaviour of the C library (when using printf, for example) and of the C++ library (when using cout, for example), so any program written in C or C++ will do this.

Most scripting languages (Perl, Python, etc.) have interpreters written in C or C++, so they show the same kind of buffering behaviour.

The answer above (using LD_PRELOAD) can be made to work on perl or python scripts, since the interpreters are themselves written in C.

The unbuffer command from the expect package does exactly what you are looking for.

$ sudo apt-get install expect
$ unbuffer python program.py | cat -
<watch output immediately show up here>
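The idea behind tools such as unbuffer and screen is to attach the program's output to a pseudo-terminal, so the program believes it is writing to a terminal and keeps line buffering. Below is a minimal Python sketch of that idea for Unix-like systems, using the standard pty module. It is an illustration under those assumptions, not a description of how unbuffer itself is written, and the function name run_on_pty is made up for this example.

```python
import os
import pty
import subprocess
import sys


def run_on_pty(argv):
    """Run argv with stdout/stderr attached to a pseudo-terminal.

    Returns everything the child wrote, decoded as text. Terminal output
    normally uses "\r\n" line endings.
    """
    master, slave = pty.openpty()
    proc = subprocess.Popen(argv, stdout=slave, stderr=slave)
    os.close(slave)  # the parent only needs the master side
    chunks = []
    while True:
        try:
            data = os.read(master, 1024)
        except OSError:
            # Linux reports an I/O error once the child has closed its side.
            break
        if not data:
            break
        chunks.append(data)
    os.close(master)
    proc.wait()
    return b"".join(chunks).decode(errors="replace")


if __name__ == "__main__":
    # The child reports whether it thinks its stdout is a terminal.
    child = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]
    print(run_on_pty(child).strip())
```

A real wrapper would forward each chunk as soon as it is read instead of collecting everything until the child exits; the collection here just keeps the sketch short.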
