开发者

cat file | ... vs ... <file

Is开发者_开发知识库 there a case of ... or context where cat file | ... behaves differently than ... <file?


When reading from a regular file, cat is in charge of reading the data, performs it as it pleases, and might constrain it in the way it writes it to the pipeline. Obviously, the contents themselves are preserved, but anything else could be tainted. For example: block size and data arrival timing. Additionally, the pipe in itself isn't always neutral: it serves as an additional buffer between the input and ....

Quick and easy way to make the block size issue apparent:

$ cat large-file | pv >/dev/null
5,44GB 0:00:14 [ 393MB/s] [              <=>                                  ]
$ pv <large-file >/dev/null
5,44GB 0:00:03 [1,72GB/s] [=================================>] 100%


Besides the thing posted by other users, when using input redirection from a file, standard input is the file but when piping the output of cat to the input, standard input is a stream with the contents of the file. When standard input is the file will be able to seek within the file but the pipe will not allow it. You can see this by finding a zip file and running the following commands:

zipinfo /dev/stdin < thezipfile.zip

and

cat thezipfile.zip | zipinfo /dev/stdin

The first command will show the contents of the zipfile while the second will show an error, though it is a misleading error because zipinfo does not check the result of the seek call and errors later on.


A useless use of cat is always to be avoided. It's like driving with the handbrake on. It wastes CPU cycles for nothing, the OS constantly context switching between the cat process and the next in the pipe. If all the world's useless cats were gone and stopped being invented, reinvented, passed on from father to son, we wouldn't have global warming because we could easily live with 1.21 Gigawatts of power saved.

Thanks. I feel better now. Please join me in my crusade to stamp out useless use of cat on stackoverflow. This site is, as far as I perceive it, a major contribution to the proliferation of useless cats. I don't blame the newbies, but I do want to teach them. Workers and newbies of the world, loosen the handbrakes and save the planet!!!1!


cat will allow you to pipe multiple files in sequentially. Otherwise, < redirection and cat file | produce the same side effects.


Pipes cause a subshell to be invoked for the command on the right. This interferes with environment variables.

cat foo | while read line
do
  ...
done
echo "$line"

versus

while read line
do
  ...
done < foo
echo "$line"


One further difference is behavior on a blocking open() of the input file.

For example, assuming input is a FIFO with no writers, one invocation will not spawn any child programs until the input file is opened, while the other will spawn two processes:

prog ... < a_fifo      # 'prog' not launched until shell can open file
cat a_fifo | prog ...  # 'prog' and 'cat' are running (latter may block on open)

In practice this rarely matters except in contrived circumstances. prog might periodically log or do some cleanup work while waiting for input, for example, which you might want to happen even if no input is available. (Why wouldn't prog be sophisticated enough to open its own input fifo nonblocking?)


cat file | starts up another program (cat) that doesn't have to start in the second case. It also makes it more confusing if you want to use "here documents". But it should behave the same.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜