Is the order that tee prints to stdout guaranteed?

2023-01-05 16:00

You can split a pipe using the tee command under Linux as follows:

printf "line1\nline2\nline3\n" | tee >(wc -l ) | (awk '{print "this is awk: "$0}')

which yields the output

this is awk: line1
this is awk: line2
this is awk: line3
this is awk: 3

My question: is that order of printing guaranteed? Will the branch of the tee split that counts the number of lines always print at the end? Is there a way to always print it at the start? Or is the order of printing with tee never guaranteed?

It is not defined by tee, but as Daenyth says, wc won't be finished until tee has finished passing it data - so usually tee will have passed it on to awk by then too. In this instance it might be better to have awk do the counting.

echo -ne {one,two,three,four}\\n | \
awk '{print "awk processing line " NR ": "$0} END {print "Awk saw " NR " lines"}'

The downside is that awk won't know the number until it finishes (knowing it earlier requires buffering the data). In your example, both tee and wc have stdout connected to the same pipe (stdin for awk), but the order in which their output arrives there is undefined. cat (and most other piping tools) can be used to assemble files in a known order.
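To illustrate the buffering point, here is a minimal sketch that stores the input in a temporary file, so the count can be printed before the lines in a fixed order. The helper name count_first and the "count:" label are made up for this example, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Sketch: buffer stdin in a temporary file so the line count can come first.
# count_first is a hypothetical helper name, not something from the answer above.
count_first() (
    tmp=$(mktemp) || exit 1
    trap 'rm -f "$tmp"' EXIT      # remove the buffer when this subshell exits
    cat > "$tmp"                  # read all of stdin before printing anything
    # Arithmetic expansion strips the padding some wc implementations emit.
    echo "count: $(( $(wc -l < "$tmp") ))"
    awk '{print "this is awk: " $0}' "$tmp"
)

printf 'line1\nline2\nline3\n' | count_first
```

Because nothing runs concurrently here, the order does not depend on scheduling or pipe buffer sizes; the cost is that the whole input has to fit on disk before any output appears.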

There are more advanced piping techniques that could be used, such as bash coprocesses (coproc) or named pipes (mkfifo or mknod p). The latter gets you names in the filesystem, which can be passed to other processes, but you'll have to clean them up and avoid collisions. mktemp (or the older tempfile) or $$ may be useful for that. Pipes are not for buffering data, as they have limited capacity and writes simply block once they are full.
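As a rough sketch of the named-pipe approach with cleanup, the following keeps the FIFO in a private directory created by mktemp -d and removes it on exit. wc writes its count to a regular file instead of the shared pipe, and the count is printed only after wc has exited, so it always comes last. The helper name count_last and the file names are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: named pipe in a private temp directory, cleaned up on exit.
# count_last is a hypothetical helper name.
count_last() (
    dir=$(mktemp -d) || exit 1
    trap 'rm -rf "$dir"' EXIT     # removes the FIFO and the count file
    mkfifo "$dir/wcin"

    # wc reads from the FIFO and writes to a file, not to the pipe awk reads.
    wc -l < "$dir/wcin" > "$dir/count" &

    # tee copies stdin to the FIFO and to awk; both readers consume as they go.
    tee "$dir/wcin" | awk '{print "this is awk: " $0}'

    wait                          # wc has exited, so the count file is complete
    echo "this is awk: $(( $(cat "$dir/count") ))"
)

printf 'line1\nline2\nline3\n' | count_last
```

This avoids a stall because neither reader waits on the other: wc and awk each drain their own pipe continuously, and the ordering is enforced by wait rather than by timing.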

An example of where pipes are the wrong solution:

mkfifo wcin wcout
wc -l < wcin > wcout &
yes | dd count=1 bs=8M | tee wcin | cat -n wcout - | head

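The problem here is that tee will get stuck trying to write things to cat, which wants to finish with wcout first. But wc writes nothing to wcout until it sees EOF on wcin, and that only happens once tee has finished. There's simply too much data for the pipe from tee to cat, so everything stalls.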

Edit regarding dmckee's answer: Yes, the order may be repeatable, but it is not guaranteed. It is a matter of scale, scheduling and buffer sizes. On this GNU/Linux box, the example starts breaking up after a few thousand lines:

seq -f line%g 20000 | tee >(awk '{print "*" $0 "*"}' ) | \
(awk '{print "this is awk: "$0}') | less
this is awk: line2397
this is awk: line2398
this is awk: line2*line1*
this is awk: *line2*
this is awk: *line3*

I suspect that in this case, wc is waiting for EOF, and so it will not return (or print output) until the first command is done sending input, whereas awk acts line by line and so will always print first. I don't know if it's defined when sending to other processes.

Why not just have awk count the lines before printing the lines themselves?
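A minimal sketch of that idea, assuming the input is small enough to hold in memory: awk stores every line in an array and prints the count first in its END block. The function name count_first_awk is made up for this example.

```shell
#!/bin/sh
# Sketch: let awk buffer the lines so it can print the count before them.
# count_first_awk is a hypothetical helper name.
count_first_awk() {
    awk '
        { lines[NR] = $0 }                    # remember every input line
        END {
            print "Awk saw " NR " lines"      # the count comes first
            for (i = 1; i <= NR; i++)
                print "this is awk: " lines[i]
        }'
}

printf 'line1\nline2\nline3\n' | count_first_awk
```

With a single process producing all the output, the order is fixed by the program itself rather than by how two writers happen to be scheduled.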

~~I don't think that you can count on it. The wc here runs in a separate process, so there is no synchronization.~~ My trial run suggests that it might be (at least in bash). As Daenyth explains, this particular case is special, but try it with grep -o line instead of wc and see what you get.

That said, on my MacBook I get:

$ printf "line1\nline2\nline3\nline4\nline5\n" | tee >(grep -o line ) | (awk '{print "this is awk: "$0}')
this is awk: line1
this is awk: line2
this is awk: line3
this is awk: line4
this is awk: line5
this is awk: line
this is awk: line
this is awk: line
this is awk: line
this is awk: line

very consistently. I'd have to read the bash man page very closely to be sure.

Similarly:

$ printf "line1\nline2\nline3\nline4\nline5\n" | tee >(awk '{print "*" $0 "*"}' ) | (awk '{print "this is awk: "$0}')
this is awk: line1
this is awk: line2
this is awk: line3
this is awk: line4
this is awk: line5
this is awk: *line1*
this is awk: *line2*
this is awk: *line3*
this is awk: *line4*
this is awk: *line5*

every time... and

$ printf "line1\nline2\nline3\nline4\nline5\n" | tee >(awk '{print "*" $0 "*"}' ) | (grep line)
line1
line2
line3
line4
line5
*line1*
*line2*
*line3*
*line4*
*line5*
