How to align 3 files based on first column value
I have 3 text files: c.dat, n.dat, and h.dat.
The contents are similar, in this format:
c.dat      n.dat      h.dat
1 0.ccc    3 1.nnn    1 2.hhh
2 0.ccc    4 1.nnn    2 2.hhh
4 0.ccc    5 1.nnn    5 2.hhh
Desired output:
1 0.ccc Inf 2.hhh
2 0.ccc Inf 2.hhh
3 Inf 1.nnn Inf
4 0.ccc 1.nnn Inf
5 Inf 1.nnn 2.hhh
6 Inf Inf Inf
7 ....
Each file has ~100 rows, but they don't always start from 1 and aren't always consecutive.
I need to align the 3 files by the first column, such that if the other files don't have it, it's filled in with something like NA, or NaN, or Inf... anything.
Thanks!
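awk '
{
    if (FNR == 1) { f++ }               # first line of a new file: advance the file counter
    a[$1, f] = $2                        # store the value under (index, file number)
    if ($1 + 0 > max) { max = $1 + 0 }   # track the largest index, compared numerically
}
END {
    for (j = 1; j <= max; j++) {
        printf("%d", j)
        for (i = 1; i <= f; i++) {
            # test membership with "in" so a value of 0 is not mistaken for a missing row
            if ((j, i) in a) { printf("\t%s", a[j, i]) }
            else             { printf("\tInf") }
        }
        printf("\n")
    }
}' ./c.dat ./n.dat ./h.dat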
Output
$ ./awk.dat
1 0.ccc Inf 2.hhh
2 0.ccc Inf 2.hhh
3 Inf 1.nnn Inf
4 0.ccc 1.nnn Inf
5 Inf 1.nnn 2.hhh
Kind of a dirty solution: I created a Perl script that loops from 1 to 100, performing 3 grep statements, each piped to awk, and prints Inf if that returns blank.
Still looking for a more elegant solution.
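For reference, a rough sketch of what that grep-per-index loop might look like, written in shell rather than Perl. The original script was not posted, so the function name align_by_grep and its arguments are made up for illustration.

```shell
# Hypothetical helper (not the original Perl script): for every index from
# 1 to MAX, grep each file for a line starting with that index and print
# its second field, or Inf when nothing matches.
# Example: align_by_grep 100 c.dat n.dat h.dat
align_by_grep() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        row=$i
        for f in "$@"; do
            # Lines whose first field is exactly i, then take field 2 of the first hit.
            val=$(grep "^${i}[[:space:]]" "$f" | awk 'NR == 1 { print $2 }')
            row="$row ${val:-Inf}"
        done
        printf '%s\n' "$row"
        i=$((i + 1))
    done
}
```

This rereads every file once per index, which is fine for ~100 rows but is the reason it feels dirty compared with a single pass.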
Pure Bash.
maxindex=0
while read idx val ; do # build array from c.dat
c[idx]=$val
[ $maxindex -lt $idx ] && maxindex=$idx
done < 'c.dat'
while read idx val ; do # build array from n.dat
n[idx]=$val
[ $maxindex -lt $idx ] && maxindex=$idx
done < 'n.dat'
while read idx val ; do # build array from h.dat
h[idx]=$val
[ $maxindex -lt $idx ] && maxindex=$idx
done < 'h.dat'
for (( idx=1; idx<=$maxindex; idx+=1 )); do
echo -e "$idx ${c[idx]:-INF} ${n[idx]:-INF} ${h[idx]:-INF}"
done
man paste should give you the answer: paste -d ' ' file1 file2 file3
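To see what that command produces on the sample data from the question, here is a small self-contained check. The scratch directory and the variable names are only for illustration. Note that paste joins the files line by line and never compares the first column, so it gives the side-by-side layout, not rows aligned by index.

```shell
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf '1 0.ccc\n2 0.ccc\n4 0.ccc\n' > "$dir/c.dat"
printf '3 1.nnn\n4 1.nnn\n5 1.nnn\n' > "$dir/n.dat"
printf '1 2.hhh\n2 2.hhh\n5 2.hhh\n' > "$dir/h.dat"

# paste glues line N of each file together; the keys are not compared.
pasted=$(paste -d ' ' "$dir/c.dat" "$dir/n.dat" "$dir/h.dat")
printf '%s\n' "$pasted"
# 1 0.ccc 3 1.nnn 1 2.hhh
# 2 0.ccc 4 1.nnn 2 2.hhh
# 4 0.ccc 5 1.nnn 5 2.hhh

rm -r "$dir"
```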