Join all files in a directory

How can I join all of the files in a directory? I can do it in one step by explicitly naming the files, as shown below. Is there a way to do it without explicitly naming the files?

join <(\
join <(\
join <(\
join\
<(sort ${rpkmDir}/HS0477.chsn.rpkm)\
<(sort ${rpkmDir}/HS0428.chsn.rpkm) )\
<(sort ${rpkmDir}/HS0419.chsn.rpkm) )\
<(sort ${rpkmDir}/HS0299.chsn.rpkm) )\
<(sort ${rpkmDir}/HS0445.chsn.rpkm)


#!/bin/bash

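data=
for f in "${rpkmDir}"/HS*.chsn.rpkm
do
  # First file: nothing to join against yet, so just sort it
  if [ -z "$data" ]
  then
    data="$(sort "$f")"
    continue
  fi
  # Join the next sorted file against the accumulated result on stdin
  data="$(join <(sort "$f") /dev/stdin <<< "$data")"
done
echo "$data"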


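Use awk. For example, to join on the first field (note that the order of the output lines is unspecified, and each whole input line, including its key, is appended):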

awk '{a[$1]=a[$1] FS $0}END{for(i in a) print i,a[i]}' file*
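A variant that drops the repeated key and sorts the result might look like the sketch below. The sample file names and contents are made up for illustration, and it assumes whitespace-separated fields with the key in the first column. Unlike join, this keeps keys that are missing from some of the files; those lines simply end up with fewer columns.

```shell
#!/bin/bash
# Use the C locale so the final sort order is predictable.
export LC_ALL=C

# Hypothetical sample data: three small files keyed on the first field.
demo_dir=$(mktemp -d)
printf 'g1 1\ng2 2\n'     > "$demo_dir/HS0001.chsn.rpkm"
printf 'g2 20\ng1 10\n'   > "$demo_dir/HS0002.chsn.rpkm"
printf 'g1 100\ng2 200\n' > "$demo_dir/HS0003.chsn.rpkm"

# Blank out the key field, then append the rest of the line to that key's entry.
# Assigning to $1 rebuilds $0 with a leading space, which separates the values.
awk_result=$(
  awk '{ key = $1; $1 = ""; a[key] = a[key] $0 }
       END { for (k in a) print k a[k] }' "$demo_dir"/HS*.chsn.rpkm | sort
)
echo "$awk_result"

rm -rf "$demo_dir"
```

The values for each key appear in the order in which the shell expands the file glob.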


Since join (in classic UNIX and under POSIX) is defined to work on exactly two files at a time, you are going to have to do the iteration yourself, somehow.

While your notation is marvellously minimal, it is also inscrutable. I think you can probably use pipes, together with the fact that '-' as a file name denotes standard input, to alter the sequencing. The hard part is connecting everything together without creating any explicit temporary files. You may be best off simply writing a script that generates your script notation and feeds it into bash.

Maybe (untested script):

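cd "${rpkmDir}" || exit 1
ls HS*.chsn.rpkm |
{
read -r file
script="sort $file"
while read -r file
do
    script="$script | join - <(sort $file)"
done
# Emit the generated pipeline so that the bash below receives and runs it
echo "$script"
} | bash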
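The same iteration can also be written without generating a script, for instance as a small recursive function built on process substitution. This is only a sketch: join_all is a hypothetical helper name, the sample files are invented for the demonstration, and it assumes bash and a join key in the first field.

```shell
#!/bin/bash
# sort and join must agree on collation order, so pin the locale.
export LC_ALL=C

# join_all FILE...: join any number of files on their first field.
# (Hypothetical helper; requires bash for process substitution.)
join_all() {
  if [ "$#" -eq 1 ]; then
    # Base case: a single file only needs sorting.
    sort "$1"
  else
    local first=$1
    shift
    # Join the first file against the join of all remaining files.
    join <(sort "$first") <(join_all "$@")
  fi
}

# Hypothetical sample data standing in for the files in ${rpkmDir}.
demo_dir=$(mktemp -d)
printf 'g1 1\ng2 2\n'     > "$demo_dir/HS0001.chsn.rpkm"
printf 'g2 20\ng1 10\n'   > "$demo_dir/HS0002.chsn.rpkm"
printf 'g1 100\ng2 200\n' > "$demo_dir/HS0003.chsn.rpkm"

result=$(join_all "$demo_dir"/HS*.chsn.rpkm)
echo "$result"

rm -rf "$demo_dir"
```

With real data the call would presumably be join_all "${rpkmDir}"/HS*.chsn.rpkm. As with plain join, only keys present in every file survive.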


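You can combine the files with cat ./* >outfile, but note that this only concatenates them end to end; it does not join lines on a key field the way join does.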
