How to merge sorted files without using a temporary file?
I'm trying to merge many sorted files in a UNIX/Linux script with sort -m, and I noticed that sort first writes the result to a temporary file, then copies it to the destination. My understanding of -m was that it assumes the files are already sorted, so using a temporary file is completely unnecessary, and it wastes both disk space and CPU cycles (I'm using sort in a pipeline, which gets stuck waiting for sort to output anything). Is there a way to tell sort not to use temporary files when merging sorted files? Or a better version which doesn't?
The exact command line looks like:
$ sort -m -s -t '_' -k 1,1n -k 2,2n <(gunzip <file_1) [...] <(gunzip <file_n) | gzip >output
I'm using sort from GNU coreutils 5.97.
Check out these options from man sort; they might let you minimize the amount of space needed for merging.
--batch-size=NMERGE
merge at most NMERGE inputs at once; for more use temp files
--compress-program=PROG
compress temporaries with PROG; decompress them with PROG -d
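As a rough illustration of the first option, here is a minimal sketch that merges 20 small sorted files with the batch size raised above the number of inputs, so that, going by the man page text above, no intermediate temp files should be needed. It assumes a GNU sort recent enough to accept --batch-size; the scratch directory and the part_N file names are made up for the example.

```shell
# Build 20 tiny, numerically sorted inputs in a scratch directory
# (directory and file names are hypothetical).
workdir=$(mktemp -d)
for i in $(seq 1 20); do
    printf '%s\n' "$i" "$((i + 100))" > "$workdir/part_$i"
done

# Merge with a batch size larger than the number of inputs,
# so all files can be merged in a single pass.
merged=$(sort -m -n --batch-size=32 "$workdir"/part_*)

# Show the first few merged lines, then clean up.
printf '%s\n' "$merged" | head -n 3
rm -r "$workdir"
```

The batch size is bounded by the number of file descriptors the process may open, so very large values may be reduced or rejected.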
Running with GNU coreutils 6.10, I'm not seeing that problem.
One thing about the command line that you're using is that the <(...) redirection writes the input to a temporary file before starting the command. Could that be the delay you are seeing?
I ran this command:
sort -m a b c d e f g h i j | more
and it did not create a temp file for the output. I piped the output into more so it would block and then looked in /proc to see what sort was doing. It had all of the input files opened, and the pipe to the more command, but that was it. No temporary file:
$ ls -l /proc/1308/fd
total 0
lrwx------ 1 brianb brianb 64 2014-06-24 18:50 0 -> /dev/pts/0
l-wx------ 1 brianb brianb 64 2014-06-24 18:50 1 -> pipe:[217016034]
lr-x------ 1 brianb brianb 64 2014-06-24 18:50 10 -> /home/brianb/h
lr-x------ 1 brianb brianb 64 2014-06-24 18:50 11 -> /home/brianb/i
lr-x------ 1 brianb brianb 64 2014-06-24 18:50 12 -> /home/brianb/j
lrwx------ 1 brianb brianb 64 2014-06-24 18:50 2 -> /dev/pts/0
lr-x------ 1 brianb brianb 64 2014-06-24 18:50 3 -> /home/brianb/a
lr-x------ 1 brianb brianb 64 2014-06-24 18:50 4 -> /home/brianb/b
lr-x------ 1 brianb brianb 64 2014-06-24 18:50 5 -> /home/brianb/c
lr-x------ 1 brianb brianb 64 2014-06-24 18:50 6 -> /home/brianb/d
lr-x------ 1 brianb brianb 64 2014-06-24 18:50 7 -> /home/brianb/e
lr-x------ 1 brianb brianb 64 2014-06-24 18:50 8 -> /home/brianb/f
lr-x------ 1 brianb brianb 64 2014-06-24 18:50 9 -> /home/brianb/g
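If you want to repeat this check in a script rather than by hand, something along these lines should work on Linux. It is only a sketch: the odd/even input files and the FIFO named out are invented for the example, and the one-second sleep is a crude way of giving sort time to open its inputs and block on a full pipe.

```shell
# Linux-only: relies on /proc/PID/fd. All file names here are hypothetical.
workdir=$(mktemp -d)
seq 1 2 400000 > "$workdir/odd"     # sorted odd numbers
seq 2 2 400000 > "$workdir/even"    # sorted even numbers

# A FIFO that nobody drains stands in for the blocked "more" pipe.
mkfifo "$workdir/out"
exec 3<>"$workdir/out"              # hold it open so the writer's open() does not block

sort -m -n "$workdir/odd" "$workdir/even" > "$workdir/out" &
sort_pid=$!
sleep 1                             # let sort fill the pipe buffer and block

# List what sort has open: expect the inputs and the FIFO, no temp file.
fd_listing=$(ls -l "/proc/$sort_pid/fd")
printf '%s\n' "$fd_listing"

# Clean up.
kill "$sort_pid"
wait "$sort_pid" 2>/dev/null || true
exec 3<&-
rm -r "$workdir"
```

Any temporary file sort had created would show up in the listing as an extra descriptor, typically pointing into /tmp or $TMPDIR.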